Preface


When mathematical modelling is used to describe physical, biological or chemical 
phenomena, one of the most common results is either a differential equation or 
a system of differential equations, together with appropriate boundary and initial 
conditions. These differential equations may be ordinary or partial, and finding 
and interpreting their solution is at the heart of applied mathematics. A thorough 
introduction to differential equations is therefore a necessary part of the education 
of any applied mathematician, and this book is aimed at building up skills in this 
area. For similar reasons, the book should also be of use to mathematically-inclined 
physicists and engineers. 

Although the importance of studying differential equations is not generally in 
question, exactly how the theory of differential equations should be taught, and 
what aspects should be emphasized, is more controversial. In our experience, text- 
books on differential equations usually fall into one of two categories. Firstly, there 
is the type of textbook that emphasizes the importance of abstract mathematical 
results, proving each of its theorems with full mathematical rigour. Such textbooks 
are usually aimed at graduate students, and are inappropriate for the average un- 
dergraduate. Secondly, there is the type of textbook that shows the student how 
to construct solutions of differential equations, with particular emphasis on algo- 
rithmic methods. These textbooks often tackle only linear equations, and have no 
pretension to mathematical rigour. However, they are usually well-stocked with 
interesting examples, and often include sections on numerical solution methods. 

In this textbook, we steer a course between these two extremes, starting at the 
level of preparedness of a typical, but well-motivated, second year undergraduate 
at a British university. As such, the book begins in an unsophisticated style with 
the clear objective of obtaining quantitative results for a particular linear ordi- 
nary differential equation. The text is, however, written in a progressive manner, 
with the aim of developing a deeper understanding of ordinary and partial differ- 
ential equations, including conditions for the existence and uniqueness of solutions, 
solutions by group theoretical and asymptotic methods, the basic ideas of con- 
trol theory, and nonlinear systems, including bifurcation theory and chaos. The 
emphasis of the book is on analytical and asymptotic solution methods. However, 
where appropriate, we have supplemented the text by including numerical solutions 
and graphs produced using MATLAB†, version 6. We assume some knowledge of
MATLAB (summarized in Appendix 7), but explain any nontrivial aspects as they
arise. Where mathematical rigour is required, we have presented the appropriate
analysis, on the basis that the student has taken first courses in analysis and linear
algebra. We have, however, avoided any functional analysis. Most of the material
in the book has been taught by us in courses for undergraduates at the University
of Birmingham. This has given us some insight into what students find difficult,
and, as a consequence, what needs to be emphasized and reiterated.

† MATLAB is a registered trademark of The MathWorks, Inc.

The book is divided into two parts. In the first of these, we tackle linear differ- 
ential equations. The first three chapters are concerned with variable coefficient, 
linear, second order ordinary differential equations, emphasizing the methods of 
reduction of order and variation of parameters, and series solution by the method 
of Frobenius. In particular, we discuss Legendre functions (Chapter 2) and Bessel 
functions (Chapter 3) in detail, and motivate this by giving examples of how they 
arise in real modelling problems. These examples lead to partial differential equa- 
tions, and we use separation of variables to obtain Legendre’s and Bessel’s equa- 
tions. In Chapter 4, the emphasis is on boundary value problems, and we show 
how these differ from initial value problems. We introduce Sturm-Liouville theory 
in this chapter, and prove various results on eigenvalue problems. The next two 
chapters of the first part of the book are concerned with Fourier series, and Fourier 
and Laplace transforms. We discuss in detail the convergence of Fourier series, since 
the analysis involved is far more straightforward than that associated with other 
basis functions. Our approach to Fourier transforms involves a short introduction 
to the theory of generalized functions. The advantage of this approach is that a 
discussion of what types of function possess a Fourier transform is straightforward, 
since all generalized functions possess a Fourier transform. We show how Fourier 
transforms can be used to construct the free space Green’s function for both ordi- 
nary and partial differential equations. We also use Fourier transforms to derive 
the solutions of the Dirichlet and Neumann problems for Laplace’s equation. Our 
discussion of the Laplace transform includes an outline proof of the inversion the- 
orem, and several examples of physical problems, for example involving diffusion, 
that can be solved by this method. In Chapter 7 we discuss the classification of 
linear, second order partial differential equations, emphasizing the reasons why the 
canonical examples of elliptic, parabolic and hyperbolic equations, namely Laplace’s 
equation, the diffusion equation and the wave equation, have the properties that 
they do. We also consider complex variable methods for solving Laplace’s equation, 
emphasizing their application to problems in fluid mechanics. 

The second part of the book is concerned with nonlinear problems and more 
advanced techniques. Although we have used a lot of the material in Chapters 9 
and 14 (phase plane techniques and control theory) in a course for second year 
undergraduates, the bulk of the material here is aimed at third year students. We 
begin in Chapter 8 with a brief introduction to the rigorous analysis of ordinary 
differential equations. Here the emphasis is on existence, uniqueness and com- 
parison theorems. In Chapter 9 we introduce the phase plane and its associated 
techniques. This is the first of three chapters (the others being Chapters 13 and 15) 
that form an introduction to the theory of nonlinear ordinary differential equations,
often known as dynamical systems. In Chapter 10, we show how the ideas of group
theory can be used to find exact solutions of ordinary and partial differential equa- 
tions. In Chapters 11 and 12 we discuss the theory and practice of asymptotic 
analysis. After discussing the basic ideas at the beginning of Chapter 11, we move 
on to study the three most important techniques for the asymptotic evaluation of 
integrals: Laplace’s method, the method of stationary phase and the method of 
steepest descents. Chapter 12 is devoted to the asymptotic solution of differential 
equations, and we introduce the method of matched asymptotic expansions, and 
the associated idea of asymptotic matching, the method of multiple scales, includ- 
ing Kuzmak’s method for analysing the slow damping of nonlinear oscillators, and 
the WKB expansion. We illustrate each of these methods with a wide variety of 
examples, for both nonlinear ordinary differential equations and partial differential 
equations. In Chapter 13 we cover the centre manifold theorem, Lyapunov func- 
tions and an introduction to bifurcation theory. Chapter 14 is about time-optimal 
control theory in the phase plane, and includes a discussion of the controllability 
matrix and the time-optimal maximum principle for second order linear systems of 
ordinary differential equations. Chapter 15 is on chaotic systems, and, after some 
illustrative examples, emphasizes the theory of homoclinic tangles and Mel’nikov 
theory. 

There is a set of exercises at the end of each chapter. Harder exercises are 
marked with a star, and many chapters include a project, which is rather longer 
than the average exercise, and whose solution involves searches in the library or on 
the Internet, and deeper study. Bona fide teachers and instructors can obtain full 
worked solutions to many of the exercises by emailing solutions@cambridge.org. 

In order to follow many of the ideas and calculations that we describe in this 
book, and to fully appreciate the more advanced material, the reader may need 
to acquire (or refresh) some basic skills. These are covered in the appendices, 
and fall into six basic areas: linear algebra, continuity and differentiability, power 
series, sequences and series of functions, ordinary differential equations and complex 
variables. 

We would like to thank our friends and colleagues, Adam Burbidge (Nestle Re- 
search Centre, Lausanne), Norrie Everitt (Birmingham), Chris Good (Birming- 
ham), Ray Jones (Birmingham), John King (Nottingham), Dave Needham (Read- 
ing), Nigel Scott (East Anglia) and Warren Smith (Birmingham), who read and 
commented on one or more chapters of the book before it was published. Any 
nonsense remaining is, of course, our fault and not theirs. 


ACK, JB and SRO, Birmingham 2002

Linear Equations

CHAPTER ONE


Variable Coefficient, Second Order, Linear, 
Ordinary Differential Equations 


Many physical, chemical and biological systems can be described using mathematical
models. Once the model is formulated, we usually need to solve a differential
equation in order to predict and quantify the features of the system being modelled.
As a precursor to this, we consider linear, second order ordinary differential
equations of the form

$$P(x)\frac{d^2y}{dx^2} + Q(x)\frac{dy}{dx} + R(x)y = F(x),$$

with $P(x)$, $Q(x)$ and $R(x)$ finite polynomials that contain no common factor. This
equation is inhomogeneous and has variable coefficients. The form of these polynomials
varies according to the underlying physical problem that we are studying.
However, we will postpone any discussion of the physical origin of such equations
until we have considered some classical mathematical models in Chapters 2 and 3.

After dividing through by $P(x)$, we obtain the more convenient, equivalent form,

$$\frac{d^2y}{dx^2} + a_1(x)\frac{dy}{dx} + a_0(x)y = f(x), \tag{1.1}$$

where $a_1 = Q/P$, $a_0 = R/P$ and $f = F/P$. This process is mathematically legitimate,
provided that $P(x) \neq 0$. If $P(x_0) = 0$ at some point $x = x_0$, it is not legitimate,
and we call $x_0$ a singular point of the equation. If $P(x_0) \neq 0$, $x_0$ is a regular
or ordinary point of the equation. If $P(x) \neq 0$ for all points $x$ in the interval
where we want to solve the equation, we say that the equation is nonsingular, or
regular, on the interval.
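As a small illustration of this classification, the following Python sketch evaluates a polynomial $P$ at a point and reports whether the point is singular or ordinary. The function names and the sample coefficients ($P = 1 - x^2$, $Q = -2x$, $R = 2$) are our own illustrative assumptions, and the tolerance used to decide whether $P(x_0)$ vanishes is arbitrary.

```python
def poly_eval(coeffs, x):
    """Evaluate c0 + c1*x + c2*x^2 + ... by Horner's rule (lowest power first)."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result


def classify_point(P_coeffs, x0, tol=1e-12):
    """Return 'singular' if P(x0) vanishes (to within tol), else 'ordinary'."""
    return "singular" if abs(poly_eval(P_coeffs, x0)) < tol else "ordinary"


def standard_form(P_coeffs, Q_coeffs, R_coeffs, x):
    """Return (a1(x), a0(x)) = (Q/P, R/P); only valid at an ordinary point."""
    P = poly_eval(P_coeffs, x)
    if P == 0.0:
        raise ValueError("x is a singular point: cannot divide by P(x) = 0")
    return poly_eval(Q_coeffs, x) / P, poly_eval(R_coeffs, x) / P


# Illustrative choice (an assumption): P = 1 - x^2, Q = -2x, R = 2.
P, Q, R = [1.0, 0.0, -1.0], [0.0, -2.0], [2.0]
print(classify_point(P, 1.0))       # P(1) = 0
print(classify_point(P, 0.5))       # P(0.5) = 0.75
print(standard_form(P, Q, R, 0.5))  # values of a1 and a0 at x = 0.5
```

For polynomials of higher degree one would normally locate the roots of $P$ first; here we only test a given point.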

We usually need to solve (1.1) subject to either initial conditions of the form 
$y(a) = \alpha$, $y'(a) = \beta$, or boundary conditions, of which $y(a) = \alpha$ and $y(b) = \beta$
are typical examples. It is worth reminding ourselves that, given the ordinary dif- 
ferential equation and initial conditions (an initial value problem), the objective 
is to determine the solution for other values of x, typically, x > a, as illustrated in 
Figure 1.1. As an example, consider a projectile. The initial conditions are the po- 
sition of the projectile and the speed and angle to the horizontal at which it is fired. 
We then want to know the path of the projectile, given these initial conditions. 

For initial value problems of this form, it is possible to show that: 

(i) If $a_1(x)$, $a_0(x)$ and $f(x)$ are continuous on some open interval $I$ that contains
the initial point $a$, a unique solution of the initial value problem exists on
the interval $I$, as we shall demonstrate in Chapter 8.

Fig. 1.1. An initial value problem.


(ii) The structure of the solution of the initial value problem is of the form

$$y = \underbrace{Au_1(x) + Bu_2(x)}_{\text{complementary function}} + \underbrace{G(x)}_{\text{particular integral}},$$

where $A$, $B$ are constants that are fixed by the initial conditions and $u_1(x)$
and $u_2(x)$ are linearly independent solutions of the corresponding homogeneous
problem $y'' + a_1(x)y' + a_0(x)y = 0$.
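Although the emphasis here is on analytical methods, an initial value problem of this type can also be marched forward numerically from $x = a$. The sketch below is one possible way to do this, using the classical fourth order Runge-Kutta scheme applied to the equivalent first order system $y' = v$, $v' = f - a_1v - a_0y$. The scheme, the function name `solve_ivp_rk4` and the test problem $y'' + y = 0$ are our own choices for illustration, not something prescribed by the theory above.

```python
import math


def solve_ivp_rk4(a1, a0, f, a, alpha, beta, x_end, n=1000):
    """Integrate y'' + a1(x) y' + a0(x) y = f(x) with y(a) = alpha, y'(a) = beta.

    Uses n classical Runge-Kutta steps from x = a to x = x_end and
    returns the pair (y(x_end), y'(x_end)).
    """
    def rhs(x, y, v):
        # First order system: y' = v, v' = f - a1*v - a0*y.
        return v, f(x) - a1(x) * v - a0(x) * y

    h = (x_end - a) / n
    x, y, v = a, alpha, beta
    for _ in range(n):
        k1y, k1v = rhs(x, y, v)
        k2y, k2v = rhs(x + h / 2, y + h / 2 * k1y, v + h / 2 * k1v)
        k3y, k3v = rhs(x + h / 2, y + h / 2 * k2y, v + h / 2 * k2v)
        k4y, k4v = rhs(x + h, y + h * k3y, v + h * k3v)
        y += h / 6 * (k1y + 2 * k2y + 2 * k3y + k4y)
        v += h / 6 * (k1v + 2 * k2v + 2 * k3v + k4v)
        x += h
    return y, v


# Test problem (an assumption): y'' + y = 0, y(0) = 1, y'(0) = 0, so y = cos x.
zero = lambda x: 0.0
one = lambda x: 1.0
y_end, v_end = solve_ivp_rk4(zero, one, zero, 0.0, 1.0, 0.0, 1.0)
print(y_end - math.cos(1.0), v_end + math.sin(1.0))  # both differences should be tiny
```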

These results can be proved rigorously, but nonconstructively, by studying the
operator

$$Ly \equiv \frac{d^2y}{dx^2} + a_1(x)\frac{dy}{dx} + a_0(x)y,$$

and regarding $L : C^2(I) \to C^0(I)$ as a linear transformation from the space of
twice-differentiable functions defined on the interval $I$ to the space of continuous
functions defined on $I$. The solutions of the homogeneous equation are elements
of the null space of $L$. This subspace is completely determined once its basis
is known. The solution of the inhomogeneous problem, $Ly = f$, is then given
formally as $y = L^{-1}f$. Unfortunately, if we actually want to construct the solution
of a particular equation, there is a lot more work to do.

Before we try to construct the general solution of the inhomogeneous initial value
problem, we will outline a series of subproblems that are more tractable.

1.1 The Method of Reduction of Order

As a first simplification we discuss the solution of the homogeneous differential
equation

$$\frac{d^2y}{dx^2} + a_1(x)\frac{dy}{dx} + a_0(x)y = 0, \tag{1.2}$$

on the assumption that we know one solution, say $y(x) = u_1(x)$, and only need to
find the second solution. We will look for a solution of the form $y(x) = U(x)u_1(x)$.
Differentiating $y(x)$ using the product rule gives

$$\frac{dy}{dx} = \frac{dU}{dx}u_1 + U\frac{du_1}{dx},$$

$$\frac{d^2y}{dx^2} = \frac{d^2U}{dx^2}u_1 + 2\frac{dU}{dx}\frac{du_1}{dx} + U\frac{d^2u_1}{dx^2}.$$

If we substitute these expressions into (1.2) we obtain

$$\frac{d^2U}{dx^2}u_1 + 2\frac{dU}{dx}\frac{du_1}{dx} + U\frac{d^2u_1}{dx^2} + a_1(x)\left(\frac{dU}{dx}u_1 + U\frac{du_1}{dx}\right) + a_0(x)Uu_1 = 0.$$


We can now collect terms to get

$$U\left(\frac{d^2u_1}{dx^2} + a_1(x)\frac{du_1}{dx} + a_0(x)u_1\right) + u_1\frac{d^2U}{dx^2} + \frac{dU}{dx}\left(2\frac{du_1}{dx} + a_1u_1\right) = 0.$$


Now, since $u_1(x)$ is a solution of (1.2), the term multiplying $U$ is zero. We have
therefore obtained a differential equation for $dU/dx$, and, by defining $Z = dU/dx$,
have

$$u_1\frac{dZ}{dx} + Z\left(2\frac{du_1}{dx} + a_1u_1\right) = 0.$$


Dividing through by $Zu_1$ we have

$$\frac{1}{Z}\frac{dZ}{dx} + \frac{2}{u_1}\frac{du_1}{dx} + a_1 = 0,$$

which can be integrated directly to yield

$$\log|Z| + 2\log|u_1| + \int^x a_1(s)\,ds = C,$$

where $s$ is a dummy variable, for some constant $C$. Thus

$$Z = \frac{c}{u_1^2}\exp\left\{-\int^x a_1(s)\,ds\right\} = \frac{dU}{dx},$$

where $c = e^C$. This can then be integrated to give

$$U(x) = \int^x \frac{c}{u_1^2(t)}\exp\left\{-\int^t a_1(s)\,ds\right\}dt + \tilde{c},$$

for some constant $\tilde{c}$. The solution is therefore

$$y(x) = u_1(x)\int^x \frac{c}{u_1^2(t)}\exp\left\{-\int^t a_1(s)\,ds\right\}dt + \tilde{c}u_1(x).$$

We can recognize $\tilde{c}u_1(x)$ as the part of the complementary function that we knew
to start with, and

$$u_2(x) = u_1(x)\int^x \frac{1}{u_1^2(t)}\exp\left\{-\int^t a_1(s)\,ds\right\}dt \tag{1.3}$$

as the second part of the complementary function. This result is called the reduction
of order formula.


Example 

Let’s try to determine the full solution of the differential equation

$$(1 - x^2)\frac{d^2y}{dx^2} - 2x\frac{dy}{dx} + 2y = 0,$$

given that $y = u_1(x) = x$ is a solution. We firstly write the equation in standard
form as

$$\frac{d^2y}{dx^2} - \frac{2x}{1 - x^2}\frac{dy}{dx} + \frac{2}{1 - x^2}y = 0.$$

Comparing this with (1.2), we have $a_1(x) = -2x/(1 - x^2)$. After noting that

$$\int^t a_1(s)\,ds = -\int^t \frac{2s}{1 - s^2}\,ds = \log(1 - t^2),$$

the reduction of order formula gives

$$u_2(x) = x\int^x \frac{1}{t^2}\exp\left\{-\log(1 - t^2)\right\}dt = x\int^x \frac{dt}{t^2(1 - t^2)}.$$

We can express the integrand in terms of its partial fractions as

$$\frac{1}{t^2(1 - t^2)} = \frac{1}{t^2} + \frac{1}{1 - t^2} = \frac{1}{t^2} + \frac{1}{2(1 + t)} + \frac{1}{2(1 - t)}.$$

This gives the second solution of (1.2) as

$$u_2(x) = x\int^x \left\{\frac{1}{t^2} + \frac{1}{2(1 + t)} + \frac{1}{2(1 - t)}\right\}dt = x\left[-\frac{1}{t} + \frac{1}{2}\log\left(\frac{1 + t}{1 - t}\right)\right]^x = \frac{x}{2}\log\left(\frac{1 + x}{1 - x}\right) - 1,$$

and hence the general solution is

$$y = Ax + B\left\{\frac{x}{2}\log\left(\frac{1 + x}{1 - x}\right) - 1\right\}.$$
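The reduction of order formula can also be evaluated numerically when the integrals cannot be done by hand. The following Python sketch is one way of doing this under stated assumptions: both integrals are given a definite lower limit $x_0$ (our choice) and are approximated by the trapezoidal rule, and the function name `reduction_of_order` is hypothetical. Fixing the lower limit changes the result only by a constant factor and by a multiple of $u_1$, so the output is still a second solution. For the example above, it equals $(1 - x_0^2)\{u_2(x) - F(x_0)x\}$ with $F(t) = -1/t + \frac{1}{2}\log\{(1 + t)/(1 - t)\}$.

```python
import math


def reduction_of_order(u1, a1, x0, x, n=2000):
    """Approximate u1(x) * int_{x0}^{x} exp(-int_{x0}^{t} a1(s) ds) / u1(t)^2 dt.

    Both integrals are accumulated together with the trapezoidal rule
    on a uniform grid of n steps between x0 and x.
    """
    h = (x - x0) / n
    inner = 0.0                    # running value of int_{x0}^{t} a1(s) ds
    outer = 0.0                    # running value of the outer integral
    t_prev = x0
    g_prev = 1.0 / u1(x0) ** 2     # integrand at t = x0, where inner = 0
    for k in range(1, n + 1):
        t = x0 + k * h
        inner += 0.5 * h * (a1(t_prev) + a1(t))
        g = math.exp(-inner) / u1(t) ** 2
        outer += 0.5 * h * (g_prev + g)
        t_prev, g_prev = t, g
    return u1(x) * outer


# The example above: a1 = -2x/(1 - x^2), with known solution u1 = x.
a1 = lambda x: -2.0 * x / (1.0 - x * x)
u1 = lambda x: x
x0, x = 0.5, 0.8                   # illustrative points inside (0, 1)
u2_numeric = reduction_of_order(u1, a1, x0, x)

# Closed form for comparison, allowing for the definite lower limit x0.
F = lambda t: -1.0 / t + 0.5 * math.log((1.0 + t) / (1.0 - t))
u2_expected = x * (1.0 - x0 ** 2) * (F(x) - F(x0))
print(u2_numeric, u2_expected)
```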


1.2 The Method of Variation of Parameters 


Let’s now consider how to find the particular integral given the complementary
function, comprising $u_1(x)$ and $u_2(x)$. As the name of this technique suggests, we
take the constants in the complementary function to be variable, and assume that

$$y = c_1(x)u_1(x) + c_2(x)u_2(x).$$

Differentiating, we find that

$$\frac{dy}{dx} = c_1\frac{du_1}{dx} + u_1\frac{dc_1}{dx} + c_2\frac{du_2}{dx} + u_2\frac{dc_2}{dx}.$$


We will choose to impose the condition

$$u_1\frac{dc_1}{dx} + u_2\frac{dc_2}{dx} = 0, \tag{1.4}$$

and thus have

$$\frac{dy}{dx} = c_1\frac{du_1}{dx} + c_2\frac{du_2}{dx},$$


which, when differentiated again, yields

$$\frac{d^2y}{dx^2} = c_1\frac{d^2u_1}{dx^2} + \frac{du_1}{dx}\frac{dc_1}{dx} + c_2\frac{d^2u_2}{dx^2} + \frac{du_2}{dx}\frac{dc_2}{dx}.$$


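This form can then be substituted into the original differential equation to give

$$c_1\frac{d^2u_1}{dx^2} + \frac{du_1}{dx}\frac{dc_1}{dx} + c_2\frac{d^2u_2}{dx^2} + \frac{du_2}{dx}\frac{dc_2}{dx} + a_1\left(c_1\frac{du_1}{dx} + c_2\frac{du_2}{dx}\right) + a_0(c_1u_1 + c_2u_2) = f.$$

This can be rearranged to show that

$$c_1\left(\frac{d^2u_1}{dx^2} + a_1\frac{du_1}{dx} + a_0u_1\right) + c_2\left(\frac{d^2u_2}{dx^2} + a_1\frac{du_2}{dx} + a_0u_2\right) + \frac{du_1}{dx}\frac{dc_1}{dx} + \frac{du_2}{dx}\frac{dc_2}{dx} = f.$$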


Since $u_1$ and $u_2$ are solutions of the homogeneous equation, the first two terms are
zero, which gives us

$$\frac{du_1}{dx}\frac{dc_1}{dx} + \frac{du_2}{dx}\frac{dc_2}{dx} = f. \tag{1.5}$$



We now have two simultaneous equations, (1.4) and (1.5), for $c_1' = dc_1/dx$ and
$c_2' = dc_2/dx$, which can be written in matrix form as

$$\begin{pmatrix} u_1 & u_2 \\ u_1' & u_2' \end{pmatrix}\begin{pmatrix} c_1' \\ c_2' \end{pmatrix} = \begin{pmatrix} 0 \\ f \end{pmatrix}.$$

These can easily be solved to give

$$c_1' = -\frac{fu_2}{W}, \quad c_2' = \frac{fu_1}{W},$$

where

$$W = u_1u_2' - u_2u_1' = \begin{vmatrix} u_1 & u_2 \\ u_1' & u_2' \end{vmatrix}$$

is called the Wronskian. These expressions can be integrated to give

$$c_1 = \int^x -\frac{f(s)u_2(s)}{W(s)}\,ds + A, \quad c_2 = \int^x \frac{f(s)u_1(s)}{W(s)}\,ds + B.$$
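The solution of the $2 \times 2$ system for $c_1'$ and $c_2'$ is just Cramer's rule, and it is easy to check pointwise. The sketch below does this at a single point; the function name `c_primes`, the sample point $x = 0.7$ and the choice $u_1 = \cos x$, $u_2 = \sin x$, $f = x\sin x$ are illustrative assumptions.

```python
import math


def c_primes(u1, u2, du1, du2, f):
    """Solve u1*c1' + u2*c2' = 0 and du1*c1' + du2*c2' = f by Cramer's rule.

    All arguments are the numerical values of the functions at one point x.
    """
    W = u1 * du2 - u2 * du1        # the Wronskian at this point
    if W == 0.0:
        raise ValueError("Wronskian vanishes: u1 and u2 are not independent here")
    return -f * u2 / W, f * u1 / W


# Illustrative data (an assumption): u1 = cos x, u2 = sin x, f = x sin x.
x = 0.7
u1, u2 = math.cos(x), math.sin(x)
du1, du2 = -math.sin(x), math.cos(x)
f = x * math.sin(x)

c1p, c2p = c_primes(u1, u2, du1, du2, f)
residual_14 = u1 * c1p + u2 * c2p          # left hand side of (1.4)
residual_15 = du1 * c1p + du2 * c2p - f    # (1.5) with f moved to the left
print(residual_14, residual_15)            # both should vanish to rounding error
```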

We can now write down the solution of the entire problem as

$$y(x) = u_1(x)\int^x -\frac{f(s)u_2(s)}{W(s)}\,ds + u_2(x)\int^x \frac{f(s)u_1(s)}{W(s)}\,ds + Au_1(x) + Bu_2(x).$$

The particular integral is therefore

$$y(x) = \int^x f(s)\left\{\frac{u_1(s)u_2(x) - u_1(x)u_2(s)}{W(s)}\right\}ds. \tag{1.6}$$



This is called the variation of parameters formula. 

Example 


Consider the equation

$$\frac{d^2y}{dx^2} + y = x\sin x.$$

The homogeneous form of this equation has constant coefficients, with solutions

$$u_1(x) = \cos x, \quad u_2(x) = \sin x.$$

The variation of parameters formula then gives the particular integral as

$$y = \int^x s\sin s\left\{\frac{\cos s\sin x - \cos x\sin s}{1}\right\}ds,$$

since

$$W = \begin{vmatrix} \cos x & \sin x \\ -\sin x & \cos x \end{vmatrix} = \cos^2 x + \sin^2 x = 1.$$

We can split the particular integral into two integrals as

$$y(x) = \sin x\int^x s\sin s\cos s\,ds - \cos x\int^x s\sin^2 s\,ds = \frac{1}{2}\sin x\int^x s\sin 2s\,ds - \frac{1}{2}\cos x\int^x s(1 - \cos 2s)\,ds.$$

Using integration by parts, we can evaluate this, and find that

$$y(x) = -\frac{1}{4}x^2\cos x + \frac{1}{4}x\sin x + \frac{1}{8}\cos x$$

is the required particular integral. The general solution is therefore

$$y = c_1\cos x + c_2\sin x - \frac{1}{4}x^2\cos x + \frac{1}{4}x\sin x,$$

where the term $\frac{1}{8}\cos x$ has been absorbed into $c_1\cos x$.
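The variation of parameters formula lends itself to a direct numerical check. In the sketch below, which rests on our own assumptions, the integral in (1.6) is given the definite lower limit $0$ and is approximated by the composite Simpson rule; the helper names `simpson` and `particular_integral` are hypothetical. With this lower limit the formula picks out the particular integral that satisfies $y(0) = y'(0) = 0$, namely $-\frac{1}{4}x^2\cos x + \frac{1}{4}x\sin x$, which differs from the one found above only by a multiple of $\cos x$.

```python
import math


def simpson(g, a, b, n=200):
    """Composite Simpson rule for int_a^b g(s) ds with an even number n of strips."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3.0


def particular_integral(f, u1, u2, W, x0, x, n=200):
    """Evaluate (1.6) with the definite lower limit x0 (a choice made here)."""
    integrand = lambda s: f(s) * (u1(s) * u2(x) - u1(x) * u2(s)) / W(s)
    return simpson(integrand, x0, x, n)


# The example above: f = x sin x, u1 = cos x, u2 = sin x, W = 1.
f = lambda s: s * math.sin(s)
W = lambda s: 1.0
x = 2.0
y_numeric = particular_integral(f, math.cos, math.sin, W, 0.0, x)
y_closed = -0.25 * x ** 2 * math.cos(x) + 0.25 * x * math.sin(x)
print(y_numeric, y_closed)
```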

Although we have given a rational derivation of the reduction of order and vari- 
ation of parameters formulae, we have made no comment so far about why the 
procedures we used in the derivation should work at all! It turns out that this has 
a close connection with the theory of continuous groups, which we will investigate 
in Chapter 10. 





1.2.1 The Wronskian 

Before we carry on, let’s pause to discuss some further properties of the Wronskian. 
Recall that if $V$ is a vector space over $\mathbb{R}$, then two elements $v_1, v_2 \in V$ are linearly
dependent if $\exists\, \alpha_1, \alpha_2 \in \mathbb{R}$, with $\alpha_1$ and $\alpha_2$ not both zero, such that $\alpha_1v_1 + \alpha_2v_2 = 0$.

Now let $V = C^1(a, b)$ be the set of once-differentiable functions over the interval
$a < x < b$. If $u_1, u_2 \in C^1(a, b)$ are linearly dependent, $\exists\, \alpha_1, \alpha_2 \in \mathbb{R}$ such that
$\alpha_1u_1(x) + \alpha_2u_2(x) = 0$ $\forall x \in (a, b)$. Notice that, by direct differentiation, this also
gives $\alpha_1u_1'(x) + \alpha_2u_2'(x) = 0$ or, in matrix form,

$$\begin{pmatrix} u_1(x) & u_2(x) \\ u_1'(x) & u_2'(x) \end{pmatrix}\begin{pmatrix} \alpha_1 \\ \alpha_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}.$$

These are homogeneous equations of the form

$$Ax = 0,$$

which only have nontrivial solutions if $\det(A) = 0$, that is

$$W = \begin{vmatrix} u_1(x) & u_2(x) \\ u_1'(x) & u_2'(x) \end{vmatrix} = u_1u_2' - u_1'u_2 = 0.$$

In other words, the Wronskian of two linearly dependent functions is identically
zero on $(a, b)$. The contrapositive of this result is that if $W \not\equiv 0$ on $(a, b)$, then $u_1$
and $u_2$ are linearly independent on $(a, b)$.


Example 

The functions $u_1(x) = x^2$ and $u_2(x) = x^3$ are linearly independent on the interval
$(-1, 1)$. To see this, note that, since $u_1(x) = x^2$, $u_2(x) = x^3$, $u_1'(x) = 2x$, and
$u_2'(x) = 3x^2$, the Wronskian of these two functions is

$$W = \begin{vmatrix} x^2 & x^3 \\ 2x & 3x^2 \end{vmatrix} = 3x^4 - 2x^4 = x^4.$$

This quantity is not identically zero, and hence $x^2$ and $x^3$ are linearly independent
on $(-1, 1)$.
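For polynomial functions the Wronskian can be computed exactly by manipulating coefficient lists. The sketch below is a minimal illustration of this; the representation (coefficients listed from the lowest power upwards) and the function names are our own assumptions.

```python
def poly_diff(p):
    """Derivative of the polynomial p[0] + p[1]*x + p[2]*x^2 + ..."""
    return [k * p[k] for k in range(1, len(p))] or [0]


def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def poly_sub(p, q):
    """Difference p - q, padding the shorter list with zeros."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [a - b for a, b in zip(p, q)]


def wronskian(u1, u2):
    """Coefficients of W = u1*u2' - u1'*u2."""
    return poly_sub(poly_mul(u1, poly_diff(u2)), poly_mul(poly_diff(u1), u2))


W = wronskian([0, 0, 1], [0, 0, 0, 1])   # u1 = x^2, u2 = x^3
print(W)                                  # coefficient list of x^4
```

The only nonzero entry of `W` is the coefficient of $x^4$, in agreement with the calculation above.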


Example 

The functions $u_1(x) = f(x)$ and $u_2(x) = kf(x)$, with $k$ a constant, are linearly
dependent on any interval, since their Wronskian is

$$W = \begin{vmatrix} f & kf \\ f' & kf' \end{vmatrix} = 0.$$



If the functions $u_1$ and $u_2$ are solutions of (1.2), we can show by differentiating
$W = u_1u_2' - u_1'u_2$ directly that

$$\frac{dW}{dx} + a_1(x)W = 0.$$

This first order differential equation has solution

$$W(x) = W(x_0)\exp\left\{-\int_{x_0}^x a_1(t)\,dt\right\}, \tag{1.7}$$


which is known as Abel’s formula. This gives us an easy way of finding the 
Wronskian of the solutions of any second order differential equation without having 
to construct the solutions themselves. 


Example 

Consider the equation

$$y'' + \frac{1}{x}y' + \left(1 - \frac{1}{x^2}\right)y = 0.$$

Using Abel’s formula, this has Wronskian

$$W(x) = W(x_0)\exp\left\{-\int_{x_0}^x \frac{dt}{t}\right\} = \frac{x_0W(x_0)}{x} = \frac{A}{x}$$

for some constant $A$. To find this constant, it is usually necessary to know more
about the solutions $u_1(x)$ and $u_2(x)$. We will describe a technique for doing this in
Section 1.3.
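Abel's formula can be checked numerically whenever both solutions happen to be known. As a hedged illustration, the sketch below uses the equation $(1 - x^2)y'' - 2xy' + 2y = 0$ from Section 1.1, for which $u_1 = x$ and $u_2 = \frac{x}{2}\log\{(1 + x)/(1 - x)\} - 1$, and compares the Wronskian computed directly with the value predicted by (1.7). The trapezoidal rule, the points $x_0 = 0.2$ and $x = 0.6$, and the function names are our own choices.

```python
import math


def abel_wronskian(a1, x0, W0, x, n=2000):
    """Return W(x0) * exp(-int_{x0}^{x} a1(t) dt), using the trapezoidal rule."""
    h = (x - x0) / n
    integral = 0.5 * (a1(x0) + a1(x))
    for k in range(1, n):
        integral += a1(x0 + k * h)
    integral *= h
    return W0 * math.exp(-integral)


# Equation from Section 1.1 in standard form: a1 = -2x/(1 - x^2).
a1 = lambda x: -2.0 * x / (1.0 - x * x)
u1 = lambda x: x
du1 = lambda x: 1.0
u2 = lambda x: 0.5 * x * math.log((1.0 + x) / (1.0 - x)) - 1.0
du2 = lambda x: 0.5 * math.log((1.0 + x) / (1.0 - x)) + x / (1.0 - x * x)


def direct_wronskian(x):
    """W = u1*u2' - u1'*u2 evaluated from the known solutions."""
    return u1(x) * du2(x) - du1(x) * u2(x)


x0, x = 0.2, 0.6
W_abel = abel_wronskian(a1, x0, direct_wronskian(x0), x)
W_direct = direct_wronskian(x)
print(W_abel, W_direct)   # the two values should agree closely
```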


We end this section with a couple of useful theorems. 

Theorem 1.1 If $u_1$ and $u_2$ are linearly independent solutions of the homogeneous,
nonsingular ordinary differential equation (1.2), then the Wronskian is either
strictly positive or strictly negative.

Proof From Abel’s formula, and since the exponential function does not change
sign, the Wronskian is identically positive, identically negative or identically zero.
We just need to exclude the possibility that $W$ is ever zero. Suppose that $W(x_1) = 0$.
The vectors

$$\begin{pmatrix} u_1(x_1) \\ u_1'(x_1) \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} u_2(x_1) \\ u_2'(x_1) \end{pmatrix}$$

are then linearly dependent, and hence $u_1(x_1) = ku_2(x_1)$ and $u_1'(x_1) = ku_2'(x_1)$
for some constant $k$. The function $u(x) = u_1(x) - ku_2(x)$ is also a solution of (1.2)
by linearity, and satisfies the initial conditions $u(x_1) = 0$, $u'(x_1) = 0$. Since (1.2)
has a unique solution, the obvious solution, $u \equiv 0$, is the only solution. This means
that $u_1 \equiv ku_2$. Hence $u_1$ and $u_2$ are linearly dependent, which is a contradiction. □

The nonsingularity of the differential equation is crucial here. If we consider the
equation $x^2y'' - 2xy' + 2y = 0$, which has $u_1(x) = x^2$ and $u_2(x) = x$ as its linearly
independent solutions, the Wronskian is $-x^2$, which vanishes at $x = 0$. This is
because the coefficient of $y''$ also vanishes at $x = 0$.

Theorem 1.2 (The Sturm separation theorem) If $u_1(x)$ and $u_2(x)$ are the
linearly independent solutions of a nonsingular, homogeneous equation, (1.2), then
the zeros of $u_1(x)$ and $u_2(x)$ occur alternately. In other words, successive zeros of
$u_1(x)$ are separated by successive zeros of $u_2(x)$ and vice versa.

Proof Suppose that $x_1$ and $x_2$ are successive zeros of $u_2(x)$, so that $W(x_i) =
u_1(x_i)u_2'(x_i)$ for $i = 1$ or 2. We also know that $W(x)$ is of one sign on $[x_1, x_2]$,
since $u_1(x)$ and $u_2(x)$ are linearly independent. This means that $u_1(x_i)$ and $u_2'(x_i)$
are nonzero. Now if $u_2'(x_1)$ is positive then $u_2'(x_2)$ is negative (or vice versa), since
$u_2(x_2)$ is zero. Since the Wronskian cannot change sign between $x_1$ and $x_2$, $u_1(x)$
must change sign, and hence $u_1$ has a zero in $[x_1, x_2]$, as we claimed. □

As an example of this, consider the equation $y'' + \omega^2y = 0$, which has solution $y =
A\sin\omega x + B\cos\omega x$. If we consider any two of the zeros of $\sin\omega x$, it is immediately
clear that $\cos\omega x$ has a zero between them.
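The separation theorem is easy to observe numerically. The sketch below takes two linearly independent combinations of $\sin\omega x$ and $\cos\omega x$, locates their zeros on an interval by a sign-change scan followed by bisection, and checks that the zeros interlace. The particular combinations, the interval $[0.1, 10]$, the value $\omega = 2$ and the function names are arbitrary choices made for illustration.

```python
import math


def zeros(g, a, b, samples=2000, iters=60):
    """Locate simple zeros of g on [a, b]: scan for sign changes, then bisect."""
    found = []
    h = (b - a) / samples
    left, g_left = a, g(a)
    for k in range(1, samples + 1):
        right = a + k * h
        g_right = g(right)
        if g_left * g_right < 0:
            lo, hi, g_lo = left, right, g_left
            for _ in range(iters):
                mid = 0.5 * (lo + hi)
                g_mid = g(mid)
                if g_lo * g_mid <= 0:
                    hi = mid
                else:
                    lo, g_lo = mid, g_mid
            found.append(0.5 * (lo + hi))
        left, g_left = right, g_right
    return found


def zeros_alternate(z1, z2):
    """True if, in the merged sorted list, zeros of the two functions alternate."""
    merged = sorted([(z, 1) for z in z1] + [(z, 2) for z in z2])
    labels = [label for _, label in merged]
    return all(p != q for p, q in zip(labels, labels[1:]))


omega = 2.0
# Two independent solutions of y'' + omega^2 y = 0 (coefficients chosen arbitrarily).
u1 = lambda x: math.sin(omega * x) + 0.5 * math.cos(omega * x)
u2 = lambda x: math.cos(omega * x) - 0.3 * math.sin(omega * x)

z1 = zeros(u1, 0.1, 10.0)
z2 = zeros(u2, 0.1, 10.0)
print(zeros_alternate(z1, z2))
```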


1.3 Solution by Power Series: The Method of Frobenius 


Up to this point, we have considered ordinary differential equations for which we 
know at least one solution of the homogeneous problem. From this we have seen that 
we can easily construct the second independent solution and, in the inhomogeneous 
case, the particular integral. We now turn our attention to the more difficult 
case, in which we cannot determine a solution of the homogeneous problem by 
inspection. We must devise a method that is capable of solving variable coefficient 
ordinary differential equations in general. As we noted at the start of the chapter, 
we will restrict our attention to the case where the variable coefficients are simple 
polynomials. This suggests that we can look for a solution of the form

$$y = x^c\sum_{n=0}^{\infty} a_nx^n = \sum_{n=0}^{\infty} a_nx^{n+c}, \tag{1.8}$$

and hence

$$\frac{dy}{dx} = \sum_{n=0}^{\infty} a_n(n + c)x^{n+c-1}, \tag{1.9}$$

$$\frac{d^2y}{dx^2} = \sum_{n=0}^{\infty} a_n(n + c)(n + c - 1)x^{n+c-2}, \tag{1.10}$$

where the constants $c, a_0, a_1, \ldots$, are as yet undetermined. This is known as the
method of Frobenius. Later on, we will give some idea of why and when this 
method can be used. For the moment, we will just try to make it work. We proceed 
by example, with the simplest case first. 


1.3.1 The Roots of the Indicial Equation Differ by an Integer 

Consider the equation

$$x^2\frac{d^2y}{dx^2} + x\frac{dy}{dx} + \left(x^2 - \frac{1}{4}\right)y = 0. \tag{1.11}$$

We substitute (1.8) to (1.10) into (1.11), which gives

$$x^2\sum_{n=0}^{\infty} a_n(n + c)(n + c - 1)x^{n+c-2} + x\sum_{n=0}^{\infty} a_n(n + c)x^{n+c-1} + \left(x^2 - \frac{1}{4}\right)\sum_{n=0}^{\infty} a_nx^{n+c} = 0.$$

We can rearrange this slightly to obtain

$$\sum_{n=0}^{\infty} a_n\left\{(n + c)(n + c - 1) + (n + c) - \frac{1}{4}\right\}x^{n+c} + \sum_{n=0}^{\infty} a_nx^{n+c+2} = 0,$$

and hence, after simplifying the terms in the first summation,

$$\sum_{n=0}^{\infty} a_n\left\{(n + c)^2 - \frac{1}{4}\right\}x^{n+c} + \sum_{n=0}^{\infty} a_nx^{n+c+2} = 0.$$



We now extract the first two terms from the first summation to give

$$a_0\left(c^2 - \frac{1}{4}\right)x^c + a_1\left\{(c + 1)^2 - \frac{1}{4}\right\}x^{c+1} + \sum_{n=2}^{\infty} a_n\left\{(n + c)^2 - \frac{1}{4}\right\}x^{n+c} + \sum_{n=0}^{\infty} a_nx^{n+c+2} = 0. \tag{1.12}$$

Notice that the first term is the only one containing $x^c$ and similarly for the second
term in $x^{c+1}$.

The two summations in (1.12) begin at the same power of $x$, namely $x^{2+c}$. If we
let $m = n + 2$ in the last summation (notice that if $n = 0$ then $m = 2$, and $n = \infty$
implies that $m = \infty$), (1.12) becomes

$$a_0\left(c^2 - \frac{1}{4}\right)x^c + a_1\left\{(c + 1)^2 - \frac{1}{4}\right\}x^{c+1} + \sum_{n=2}^{\infty} a_n\left\{(n + c)^2 - \frac{1}{4}\right\}x^{n+c} + \sum_{m=2}^{\infty} a_{m-2}x^{m+c} = 0.$$

Since the variables in the summations are merely dummy variables,

$$\sum_{m=2}^{\infty} a_{m-2}x^{m+c} = \sum_{n=2}^{\infty} a_{n-2}x^{n+c},$$

and hence

$$a_0\left(c^2 - \frac{1}{4}\right)x^c + a_1\left\{(c + 1)^2 - \frac{1}{4}\right\}x^{c+1} + \sum_{n=2}^{\infty} a_n\left\{(n + c)^2 - \frac{1}{4}\right\}x^{n+c} + \sum_{n=2}^{\infty} a_{n-2}x^{n+c} = 0.$$

Since the last two summations involve identical powers of $x$, we can combine them
to obtain

$$a_0\left(c^2 - \frac{1}{4}\right)x^c + a_1\left\{(c + 1)^2 - \frac{1}{4}\right\}x^{c+1} + \sum_{n=2}^{\infty}\left[a_n\left\{(n + c)^2 - \frac{1}{4}\right\} + a_{n-2}\right]x^{n+c} = 0. \tag{1.13}$$



Although the operations above are straightforward, we need to take some care to 
avoid simple slips. 

Since (1.13) must hold for all values of $x$, the coefficient of each power of $x$ must
be zero. The coefficient of $x^c$ is therefore

$$a_0\left(c^2 - \frac{1}{4}\right) = 0.$$

Up to this point, most Frobenius analysis is very similar. It is here that the different
structures come into play. If we were to use the solution $a_0 = 0$, the series (1.8)
would have $a_1x^{c+1}$ as its first term. This is just equivalent to increasing $c$ by 1. We
therefore assume that $a_0 \neq 0$, which means that $c$ must satisfy $c^2 - \frac{1}{4} = 0$. This is
called the indicial equation, and implies that $c = \pm\frac{1}{2}$. Now, progressing to the
next term, proportional to $x^{c+1}$, we find that

$$a_1\left\{(c + 1)^2 - \frac{1}{4}\right\} = 0.$$

Choosing $c = \frac{1}{2}$ gives $a_1 = 0$, and, if we were to do this, we would find that we had
constructed a solution with one arbitrary constant. However, if we choose $c = -\frac{1}{2}$
this equation is satisfied for arbitrary values of $a_1$, and $a_1$ will act as the
second arbitrary constant for the solution. In order to generate this more general
solution, we therefore let $c = -\frac{1}{2}$.

We now progress to the individual terms in the summation. The general term
yields

$$a_n\left\{\left(n - \frac{1}{2}\right)^2 - \frac{1}{4}\right\} + a_{n-2} = 0 \quad \text{for } n = 2, 3, \ldots.$$

This is called a recurrence relation. We solve it by observation as follows. We
start by rearranging to give

$$a_n = -\frac{a_{n-2}}{n(n - 1)}. \tag{1.14}$$



By putting $n = 2$ in (1.14) we obtain

$$a_2 = -\frac{a_0}{2 \cdot 1}.$$

For $n = 3$,

$$a_3 = -\frac{a_1}{3 \cdot 2}.$$

For $n = 4$,

$$a_4 = -\frac{a_2}{4 \cdot 3},$$

and substituting for $a_2$ in terms of $a_0$ gives

$$a_4 = -\frac{1}{4 \cdot 3}\left(-\frac{a_0}{2 \cdot 1}\right) = \frac{a_0}{4 \cdot 3 \cdot 2 \cdot 1} = \frac{a_0}{4!}.$$

Similarly for $n = 5$, using the expression for $a_3$ in terms of $a_1$,

$$a_5 = -\frac{a_3}{5 \cdot 4} = -\frac{1}{5 \cdot 4}\left(-\frac{a_1}{3 \cdot 2}\right) = \frac{a_1}{5!}.$$

A pattern is emerging here and we propose that

$$a_{2n} = (-1)^n\frac{a_0}{(2n)!}, \quad a_{2n+1} = (-1)^n\frac{a_1}{(2n + 1)!}. \tag{1.15}$$

This can be proved in a straightforward manner by induction, although we will not
dwell upon the details here.†

We can now deduce the full solution. Starting from (1.8), we substitute $c = -\frac{1}{2}$
and write out the first few terms in the summation

$$y = x^{-1/2}\left(a_0 + a_1x + a_2x^2 + \cdots\right).$$

Now, using the forms of the even and odd coefficients given in (1.15),

$$y = x^{-1/2}\left(a_0 + a_1x - \frac{a_0x^2}{2!} - \frac{a_1x^3}{3!} + \frac{a_0x^4}{4!} + \cdots\right).$$

This series splits naturally into two proportional to $a_0$ and $a_1$, namely

$$y = x^{-1/2}a_0\left(1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \cdots\right) + x^{-1/2}a_1\left(x - \frac{x^3}{3!} + \frac{x^5}{5!} - \cdots\right).$$

The solution is therefore

$$y(x) = a_0\frac{\cos x}{x^{1/2}} + a_1\frac{\sin x}{x^{1/2}},$$

since we can recognize the Taylor series expansions for sine and cosine.
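The recurrence relation (1.14) can be iterated directly on a computer, which gives a quick check on the closed form solution. The following sketch is an illustration only; the function names, the values of $a_0$ and $a_1$, the evaluation point and the number of terms retained are all arbitrary assumptions.

```python
import math


def frobenius_coefficients(a0, a1, n_terms):
    """Coefficients a_0, ..., a_{n_terms-1} from a_n = -a_{n-2} / (n (n - 1))."""
    a = [a0, a1]
    for n in range(2, n_terms):
        a.append(-a[n - 2] / (n * (n - 1)))
    return a[:n_terms]


def frobenius_series(a, x, c=-0.5):
    """Partial sum of y = x^c * sum_n a_n x^n for x > 0."""
    return x ** c * sum(a_n * x ** n for n, a_n in enumerate(a))


a0, a1 = 1.5, -0.7                      # arbitrary constants
coeffs = frobenius_coefficients(a0, a1, 30)
x = 1.3                                 # arbitrary evaluation point
y_series = frobenius_series(coeffs, x)
y_closed = (a0 * math.cos(x) + a1 * math.sin(x)) / math.sqrt(x)
print(y_series, y_closed)               # the two values should agree closely
```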

This particular differential equation is an example of the use of the method of 
Frobenius, formalized by 


Frobenius General Rule I

If the indicial equation has two distinct roots,
$c = \alpha, \beta$ ($\alpha < \beta$), whose difference is an integer,
and one of the coefficients of $x^k$ becomes
indeterminate on putting $c = \alpha$, both solutions
can be generated by putting $c = \alpha$ in the recurrence
relation.


† In the usual way, we must show that (1.15) is true for $n = 0$ and that, when the value of $a_{2n+1}$
is substituted into the recurrence relation, we obtain $a_{2(n+1)+1}$, as given by substituting $n + 1$
for $n$ in (1.15).

In the above example the indicial equation was $c^2 - \frac{1}{4} = 0$, which has solutions $c =
\pm\frac{1}{2}$, whose difference is an integer. The coefficient of $x^{c+1}$ was $a_1\left\{(c + 1)^2 - \frac{1}{4}\right\} =
0$. When we choose the lower of the two values ($c = -\frac{1}{2}$) this expression does not
give us any information about the constant $a_1$, in other words $a_1$ is indeterminate.


1.3.2 The Roots of the Indicial Equation Differ by a Noninteger 
Quantity 

We now consider the differential equation

$$2x(1 - x)\frac{d^2y}{dx^2} + (1 - x)\frac{dy}{dx} + 3y = 0. \tag{1.16}$$

As before, let’s assume that the solution can be written as the power series (1.8).
As in the previous example, this can be differentiated and substituted into the
equation to yield

$$2x(1 - x)\sum_{n=0}^{\infty} a_n(n + c)(n + c - 1)x^{n+c-2} + (1 - x)\sum_{n=0}^{\infty} a_n(n + c)x^{n+c-1} + 3\sum_{n=0}^{\infty} a_nx^{n+c} = 0.$$

n— 0 

The various terms can be multiplied out, which gives us

$$\sum_{n=0}^{\infty} a_n(n + c)(n + c - 1)2x^{n+c-1} - \sum_{n=0}^{\infty} a_n(n + c)(n + c - 1)2x^{n+c} + \sum_{n=0}^{\infty} a_n(n + c)x^{n+c-1} - \sum_{n=0}^{\infty} a_n(n + c)x^{n+c} + 3\sum_{n=0}^{\infty} a_nx^{n+c} = 0.$$

Collecting similar terms gives

$$\sum_{n=0}^{\infty} a_n\{2(n + c)(n + c - 1) + (n + c)\}x^{n+c-1} + \sum_{n=0}^{\infty} a_n\{3 - 2(n + c)(n + c - 1) - (n + c)\}x^{n+c} = 0,$$

and hence

$$\sum_{n=0}^{\infty} a_n(n + c)(2n + 2c - 1)x^{n+c-1} + \sum_{n=0}^{\infty} a_n\{3 - (n + c)(2n + 2c - 1)\}x^{n+c} = 0.$$

We now extract the first term from the left hand summation so that both summations
start with a term proportional to $x^c$. This gives

$$a_0 c(2c - 1)x^{c-1} + \sum_{n=1}^{\infty} a_n(n + c)(2n + 2c - 1)x^{n+c-1} + \sum_{n=0}^{\infty} a_n\{3 - (n + c)(2n + 2c - 1)\}x^{n+c} = 0.$$

We now let $m = n + 1$ in the second summation, which then becomes

$$\sum_{m=1}^{\infty} a_{m-1}\{3 - (m - 1 + c)(2(m - 1) + 2c - 1)\}x^{m+c-1}.$$

We again note that $m$ is merely a dummy variable which for ease we rewrite as $n$,
which gives

$$a_0 c(2c - 1)x^{c-1} + \sum_{n=1}^{\infty} a_n(n + c)(2n + 2c - 1)x^{n+c-1} + \sum_{n=1}^{\infty} a_{n-1}\{3 - (n - 1 + c)(2n + 2c - 3)\}x^{n+c-1} = 0.$$

Finally, we can combine the two summations to give

$$a_0 c(2c - 1)x^{c-1} + \sum_{n=1}^{\infty}\left\{a_n(n + c)(2n + 2c - 1) + a_{n-1}\{3 - (n - 1 + c)(2n + 2c - 3)\}\right\}x^{n+c-1} = 0.$$

As in the previous example we can now consider the coefficients of successive
powers of $x$. We start with the coefficient of $x^{c-1}$, which gives the indicial equation,
$a_0 c(2c - 1) = 0$. Since $a_0 \neq 0$, this implies that $c = 0$ or $c = \frac{1}{2}$. Notice that these
roots do not differ by an integer. The general term in the summation shows that

$$a_n = a_{n-1}\left\{\frac{(n + c - 1)(2n + 2c - 3) - 3}{(n + c)(2n + 2c - 1)}\right\} \quad \text{for } n = 1, 2, \ldots. \qquad (1.17)$$


We now need to solve this recurrence relation, considering each root of the indicial 
equation separately. 


Case I: c = 0 

In this case, we can rewrite the recurrence relation (1.17) as

$$a_n = a_{n-1}\left\{\frac{(n - 1)(2n - 3) - 3}{n(2n - 1)}\right\} = a_{n-1}\left\{\frac{2n^2 - 5n}{n(2n - 1)}\right\} = a_{n-1}\left(\frac{2n - 5}{2n - 1}\right).$$

We recall that this holds for $n \geq 1$, so we start with $n = 1$, which yields

$$a_1 = a_0\left(\frac{-3}{1}\right) = -3a_0.$$

For $n = 2$

$$a_2 = a_1\left(\frac{-1}{3}\right) = a_0,$$

where we have used the expression for $a_1$ in terms of $a_0$. Now progressing to $n = 3$,
we have

$$a_3 = a_2\left(\frac{1}{5}\right) = \frac{a_0}{5},$$

and for $n = 4$,

$$a_4 = a_3\left(\frac{3}{7}\right) = \frac{3a_0}{35}.$$

Finally, for $n = 5$ we have

$$a_5 = a_4\left(\frac{5}{9}\right) = \frac{a_0}{21}.$$

In general,

$$a_n = \frac{3a_0}{(2n - 1)(2n - 3)},$$

which again can be proved by induction. We conclude that one solution of the
differential equation is

$$y = x^c\sum_{n=0}^{\infty} a_n x^n = x^0\sum_{n=0}^{\infty}\frac{3a_0 x^n}{(2n - 1)(2n - 3)}.$$

This can be tidied up by putting $3a_0 = A$, so that the solution is

$$y = A\sum_{n=0}^{\infty}\frac{x^n}{(2n - 1)(2n - 3)}. \qquad (1.18)$$
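
As a quick numerical cross-check of the general term, the recurrence (1.17) can be iterated directly and compared with the closed form. The following is only a minimal sketch, assuming the normalisation $a_0 = 1$; the variable names and the truncation level are illustrative choices, not part of the derivation above.

```matlab
% Iterate the recurrence (1.17) for c = 0 and compare with the
% closed form a_n = 3*a_0/((2n-1)(2n-3)), taking a_0 = 1 (an assumption).
N = 10;                          % highest index computed (arbitrary choice)
c = 0;                           % root of the indicial equation being tested
a = zeros(1, N+1);
a(1) = 1;                        % a(k+1) stores a_k, so a(1) is a_0
for k = 1:N
    a(k+1) = a(k) * ((k+c-1)*(2*k+2*c-3) - 3) / ((k+c)*(2*k+2*c-1));
end
k = 0:N;
closed = 3 ./ ((2*k-1) .* (2*k-3));
err = max(abs(a - closed));      % largest discrepancy between the two
disp(err)
```

If the closed form is right, `err` should be at the level of rounding error. Changing `c` to `0.5` in the loop gives the coefficients for the second root of the indicial equation instead.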


Note that there is no obvious way of writing this solution in terms of elementary
functions. In addition, a simple application of the ratio test shows that this power
series is only convergent for $|x| \leq 1$, for reasons that we discuss below.

A simple MATLAB† function that evaluates (1.18) is


function frob = frob(x)
% Evaluate the truncated series (1.18), with A = 1, at the points x.
n = 100:-1:0;                  % powers in descending order, as polyval expects
a = 1./(2*n-1)./(2*n-3);       % coefficients 1/((2n-1)(2n-3))
frob = polyval(a,x);


which sums the terms of the series up to $n = 100$. The function polyval evaluates the
polynomial formed by these terms of the sum (1.18) in an efficient manner.
Figure 1.2 can then be produced using the command ezplot(@frob,[-1,1]).

Although we could now use the method of reduction of order, since we have 
constructed a solution, this would be very complicated. It is easier to consider the 
second root of the indicial equation. 


† See Appendix 7 for a short introduction to MATLAB.

Fig. 1.2. The solution of (1.16) given by (1.18).


Case II: $c = \frac{1}{2}$

In this case, we simplify the recurrence relation (1.17) to give

$$a_n = a_{n-1}\left\{\frac{(n - \frac{1}{2})(2n - 2) - 3}{(n + \frac{1}{2})2n}\right\} = a_{n-1}\left\{\frac{2n^2 - 3n - 2}{2n^2 + n}\right\} = a_{n-1}\left\{\frac{(2n + 1)(n - 2)}{n(2n + 1)}\right\} = a_{n-1}\left(\frac{n - 2}{n}\right).$$

We again recall that this relation holds for $n \geq 1$ and start with $n = 1$, which gives
$a_1 = a_0(-1)$. Substituting $n = 2$ gives $a_2 = 0$ and, since all successive $a_i$ will be
written in terms of $a_2$, $a_i = 0$ for $i = 2, 3, \ldots$. The second solution of the equation
is therefore $y = Bx^{1/2}(1 - x)$. We can now use this simple solution in the reduction
of order formula, (1.3), to determine an analytical formula for the first solution,
(1.18). For example, for $0 \leq x \leq 1$, we find that (1.18) can be written as

$$y = -\frac{1}{6}A\left[3x - 2 + 3x^{1/2}(1 - x)\log\left\{\frac{1 + x^{1/2}}{(1 - x)^{1/2}}\right\}\right].$$

This expression has a logarithmic singularity in its derivative at $x = 1$, which
explains why the power series solution (1.18) converges only for $|x| \leq 1$.

This differential equation is an example of the second major case of the method
of Frobenius, formalized by

Frobenius General Rule II

If the indicial equation has two distinct roots,
$c = \alpha, \beta$ ($\alpha < \beta$), whose difference is not an
integer, the general solution of the equation is
found by successively substituting $c = \alpha$ then $c = \beta$
into the general recurrence relation.
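
The closed-form expression for the first solution of (1.16) can be compared numerically with a partial sum of the series (1.18). The sketch below assumes $A = 1$ and an arbitrary truncation level, and it evaluates both expressions at a few illustrative points inside $0 < x < 1$.

```matlab
% Compare a partial sum of (1.18) with the closed form quoted for 0 <= x < 1,
% taking A = 1 (an assumption made only for this check).
x = [0.1 0.25 0.5 0.9];                      % sample points (illustrative)
n = (0:2000)';                               % truncation level chosen arbitrarily
coef = 1 ./ ((2*n-1) .* (2*n-3));            % series coefficients of (1.18)
series = zeros(size(x));
for k = 1:numel(x)
    series(k) = sum(coef .* x(k).^n);        % partial sum at x(k)
end
closed = -(1/6) * (3*x - 2 + 3*sqrt(x).*(1-x).*log((1+sqrt(x))./sqrt(1-x)));
err = max(abs(series - closed));             % largest discrepancy
disp(err)
```

The two should agree to within rounding and truncation error at these points; closer to $x = 1$ the series converges slowly and more terms would be needed.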


1.3.3 The Roots of the Indicial Equation are Equal 

Let’s try to determine the two solutions of the differential equation 

$$x\frac{d^2y}{dx^2} + (1 + x)\frac{dy}{dx} + 2y = 0.$$

We substitute in the standard power series, (1.8), which gives

$$x\sum_{n=0}^{\infty} a_n(n + c)(n + c - 1)x^{n+c-2} + (1 + x)\sum_{n=0}^{\infty} a_n(n + c)x^{n+c-1} + 2\sum_{n=0}^{\infty} a_n x^{n+c} = 0.$$

This can be simplified to give

$$\sum_{n=0}^{\infty} a_n(n + c)^2x^{n+c-1} + \sum_{n=0}^{\infty} a_n(n + c + 2)x^{n+c} = 0.$$

We can extract the first term from the left hand summation to give

$$a_0c^2x^{c-1} + \sum_{n=1}^{\infty} a_n(n + c)^2x^{n+c-1} + \sum_{n=0}^{\infty} a_n(n + c + 2)x^{n+c} = 0.$$

Now shifting the series using $m = n + 1$ (and subsequently changing dummy variables
from $m$ to $n$) we have

$$a_0c^2x^{c-1} + \sum_{n=1}^{\infty}\{a_n(n + c)^2 + a_{n-1}(n + c + 1)\}x^{n+c-1} = 0, \qquad (1.19)$$

where we have combined the two summations. The indicial equation is $c^2 = 0$,
which has a double root at $c = 0$. We know that there must be two solutions, but
it appears that there is only one available to us. For the moment let’s see how far
we can get by setting $c = 0$. The recurrence relation is then

$$a_n = -a_{n-1}\frac{n + 1}{n^2} \quad \text{for } n = 1, 2, \ldots.$$

When $n = 1$ we find that

$$a_1 = -a_0\frac{2}{1^2},$$

and with $n = 2$,

$$a_2 = -a_1\frac{3}{2^2} = a_0\frac{3 \cdot 2}{1^2 \cdot 2^2}.$$

Using $n = 3$ gives

$$a_3 = -a_2\frac{4}{3^2} = -a_0\frac{4 \cdot 3 \cdot 2}{1^2 \cdot 2^2 \cdot 3^2},$$

and we conclude that

$$a_n = (-1)^n\frac{(n + 1)!}{(n!)^2}a_0 = (-1)^n\frac{n + 1}{n!}a_0.$$

One solution is therefore

$$y = a_0\sum_{n=0}^{\infty}(-1)^n\frac{(n + 1)}{n!}x^n,$$

which can also be written as

$$y = a_0\left\{\sum_{n=1}^{\infty}\frac{(-1)^nx^n}{(n - 1)!} + \sum_{n=0}^{\infty}\frac{(-1)^nx^n}{n!}\right\} = a_0\left\{-x\sum_{m=0}^{\infty}\frac{(-1)^mx^m}{m!} + e^{-x}\right\} = a_0(1 - x)e^{-x}.$$

This solution is one that we could not have readily determined simply by inspection. 
We could now use the method of reduction of order to find the second solution, but 
we will proceed with the method of Frobenius so that we can see how it works in 
this case. 

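Consider (1.19), which we write out more fully as

$$x\frac{d^2y}{dx^2} + (1 + x)\frac{dy}{dx} + 2y = a_0c^2x^{c-1} + \sum_{n=1}^{\infty}\{a_n(n + c)^2 + a_{n-1}(n + c + 1)\}x^{n+c-1} = 0.$$

The best we can do at this stage is to set $a_n(n + c)^2 + a_{n-1}(n + c + 1) = 0$ for $n \geq 1$,
as this gets rid of most of the terms. This gives us $a_n$ as a function of $c$ for $n \geq 1$,
and leaves us with

$$x\frac{d^2y}{dx^2} + (1 + x)\frac{dy}{dx} + 2y = a_0c^2x^{c-1}. \qquad (1.20)$$

Let’s now take a partial derivative with respect to $c$, where we regard $y$ as a function
of both $x$ and $c$, making use of

$$\frac{d}{dx} = \frac{\partial}{\partial x}, \quad \frac{\partial}{\partial c}\left(\frac{\partial y}{\partial x}\right) = \frac{\partial}{\partial x}\left(\frac{\partial y}{\partial c}\right).$$

This gives

$$x\frac{\partial^2}{\partial x^2}\left(\frac{\partial y}{\partial c}\right) + (1 + x)\frac{\partial}{\partial x}\left(\frac{\partial y}{\partial c}\right) + 2\frac{\partial y}{\partial c} = a_0\frac{\partial}{\partial c}\left(c^2x^{c-1}\right).$$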
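Notice that we have used the fact that $a_0$ is independent of $c$. We need to be careful
when evaluating the right hand side of this expression. Differentiating using the
product rule we have

$$\frac{\partial}{\partial c}\left(c^2x^{c-1}\right) = c^2\frac{\partial}{\partial c}\left(x^{c-1}\right) + x^{c-1}\frac{\partial}{\partial c}\left(c^2\right).$$

We rewrite $x^{c-1}$ as $x^cx^{-1} = e^{c\log x}x^{-1}$, so that we have

$$\frac{\partial}{\partial c}\left(c^2x^{c-1}\right) = c^2\frac{\partial}{\partial c}\left(e^{c\log x}x^{-1}\right) + x^{c-1}\frac{\partial}{\partial c}\left(c^2\right).$$

Differentiating the exponential gives

$$\frac{\partial}{\partial c}\left(c^2x^{c-1}\right) = c^2\left(\log x\,e^{c\log x}\right)x^{-1} + x^{c-1}2c,$$

which we can tidy up to give

$$\frac{\partial}{\partial c}\left(c^2x^{c-1}\right) = c^2x^{c-1}\log x + x^{c-1}2c.$$

Substituting this form back into the differential equation gives

$$x\frac{\partial^2}{\partial x^2}\left(\frac{\partial y}{\partial c}\right) + (1 + x)\frac{\partial}{\partial x}\left(\frac{\partial y}{\partial c}\right) + 2\frac{\partial y}{\partial c} = a_0\left\{c^2x^{c-1}\log x + x^{c-1}2c\right\}.$$

Now letting $c \to 0$ gives

$$x\frac{\partial^2}{\partial x^2}\left.\frac{\partial y}{\partial c}\right|_{c=0} + (1 + x)\frac{\partial}{\partial x}\left.\frac{\partial y}{\partial c}\right|_{c=0} + 2\left.\frac{\partial y}{\partial c}\right|_{c=0} = 0.$$

Notice that this procedure only works because (1.20) has a repeated root at $c = 0$.
We conclude that $\left.\dfrac{\partial y}{\partial c}\right|_{c=0}$ is a second solution of our ordinary differential equation.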

To construct this solution, we differentiate the power series (1.8) (carefully!) to
give

$$\frac{\partial y}{\partial c} = x^c\sum_{n=0}^{\infty}\frac{da_n}{dc}x^n + x^c\log x\sum_{n=0}^{\infty}a_nx^n,$$

using a similar technique as before to deal with the differentiation of $x^c$ with respect
to $c$. Note that, although $a_0$ is not a function of $c$, the other coefficients are. Putting
$c = 0$ gives

$$\left.\frac{\partial y}{\partial c}\right|_{c=0} = \sum_{n=0}^{\infty}\left.\frac{da_n}{dc}\right|_{c=0}x^n + \log x\sum_{n=0}^{\infty}a_n|_{c=0}\,x^n.$$

We therefore need to determine $\left.\dfrac{da_n}{dc}\right|_{c=0}$. We begin with the recurrence relation,
which is

$$a_n = \frac{-a_{n-1}(n + c + 1)}{(n + c)^2}.$$




Starting with $n = 1$ we find that

$$a_1 = \frac{-a_0(c + 2)}{(c + 1)^2},$$

whilst for $n = 2$,

$$a_2 = \frac{-a_1(c + 3)}{(c + 2)^2},$$

and substituting for $a_1$ in terms of $a_0$ gives us

$$a_2 = \frac{a_0(c + 2)(c + 3)}{(c + 1)^2(c + 2)^2}.$$

This process can be continued to give

$$a_n = (-1)^na_0\frac{(c + 2)(c + 3)\cdots(c + n + 1)}{(c + 1)^2(c + 2)^2\cdots(c + n)^2},$$

which we can write as

$$a_n = (-1)^na_0\frac{\prod_{j=1}^{n}(c + j + 1)}{\left\{\prod_{j=1}^{n}(c + j)\right\}^2}.$$

We now take the logarithm of this expression, recalling that the logarithm of a
product is the sum of the terms, which leads to

$$\log(a_n) = \log((-1)^na_0) + \log\left(\prod_{j=1}^{n}(c + j + 1)\right) - 2\log\left(\prod_{j=1}^{n}(c + j)\right)$$
$$= \log((-1)^na_0) + \sum_{j=1}^{n}\log(c + j + 1) - 2\sum_{j=1}^{n}\log(c + j).$$

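Now differentiating with respect to $c$ gives

$$\frac{1}{a_n}\frac{da_n}{dc} = \sum_{j=1}^{n}\frac{1}{c + j + 1} - 2\sum_{j=1}^{n}\frac{1}{c + j},$$

and setting $c = 0$ we have

$$\frac{1}{a_n}\left.\frac{da_n}{dc}\right|_{c=0} = \sum_{j=1}^{n}\frac{1}{j + 1} - 2\sum_{j=1}^{n}\frac{1}{j}.$$

Since we know $a_n$ when $c = 0$, we can write

$$\left.\frac{da_n}{dc}\right|_{c=0} = (-1)^na_0\frac{\prod_{j=1}^{n}(j + 1)}{\left(\prod_{j=1}^{n}j\right)^2}\left(\sum_{j=1}^{n}\frac{1}{j + 1} - 2\sum_{j=1}^{n}\frac{1}{j}\right) = (-1)^na_0\frac{(n + 1)!}{(n!)^2}\left(\sum_{j=1}^{n+1}\frac{1}{j} - 1 - 2\sum_{j=1}^{n}\frac{1}{j}\right).$$

In this expression, we have manipulated the products and written them as factorials,
changed the first summation and removed the extra term that this incurs.
Simplifying, we obtain

$$\left.\frac{da_n}{dc}\right|_{c=0} = (-1)^na_0\frac{(n + 1)}{n!}\left(\phi(n + 1) - 2\phi(n) - 1\right),$$

where we have introduced the notation

$$\phi(n) \equiv \sum_{m=1}^{n}\frac{1}{m}. \qquad (1.21)$$

The second solution is therefore

$$y = a_0\left[\sum_{n=0}^{\infty}(-1)^n\frac{(n + 1)}{n!}\{\phi(n + 1) - 2\phi(n) - 1\}x^n + \sum_{n=0}^{\infty}(-1)^n\frac{(n + 1)}{n!}x^n\log x\right].$$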


This methodology is formalized in 

Frobenius General Rule III

If the indicial equation has a double root, $c = \alpha$,
one solution is obtained by putting $c = \alpha$ into the
recurrence relation.

The second independent solution is $(\partial y/\partial c)_{c=\alpha}$,
where $a_n = a_n(c)$ for the calculation.
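
One way to gain confidence in a second solution constructed by Rule III is to substitute a truncated version of it back into the differential equation numerically. The sketch below does this for $x y'' + (1 + x)y' + 2y = 0$, using the coefficients $a_n$ and $da_n/dc$ at $c = 0$ with $a_0 = 1$; the truncation order, the sample points and all variable names are assumptions made for illustration.

```matlab
% Residual check of y = sum(b_n x^n) + log(x) * sum(a_n x^n) in
% x*y'' + (1+x)*y' + 2*y = 0, with a_0 = 1 (an assumption).
N = 40;                                       % truncation order (arbitrary)
n = 0:N;
phi = [0, cumsum(1 ./ (1:N+1))];              % phi(k+1) stores phi(k), k = 0..N+1
a = (-1).^n .* (n+1) ./ factorial(n);         % a_n at c = 0
b = a .* (phi(n+2) - 2*phi(n+1) - 1);         % da_n/dc at c = 0
x = [0.5 1 2];                                % sample points (illustrative)
res = zeros(size(x));
for k = 1:numel(x)
    xk = x(k);
    u   = sum(a .* xk.^n);                    % series multiplying log(x)
    up  = sum(n .* a .* xk.^(n-1));           % its first derivative
    upp = sum(n .* (n-1) .* a .* xk.^(n-2));  % its second derivative
    s   = sum(b .* xk.^n);                    % series without the logarithm
    sp  = sum(n .* b .* xk.^(n-1));
    spp = sum(n .* (n-1) .* b .* xk.^(n-2));
    y   = s + u*log(xk);
    yp  = sp + up*log(xk) + u/xk;
    ypp = spp + upp*log(xk) + 2*up/xk - u/xk^2;
    res(k) = xk*ypp + (1+xk)*yp + 2*y;        % left hand side of the equation
end
disp(max(abs(res)))
```

If the series is correct, the residuals should be close to rounding error at each sample point. The same pattern (differentiate the two series term by term and add the contributions from $\log x$) applies to any solution of this logarithmic form.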

There are several other cases that can occur in the method of Frobenius, which,
due to their complexity, we will not go into here. One method of dealing with these
is to notice that the method outlined in this chapter always produces a solution of
the form $y = u_1(x) = \sum_{n=0}^{\infty}a_nx^{n+c}$. This can be used in the reduction of order
formula, (1.3), to find the second linearly independent solution. Of course, it is
rather difficult to get all of $u_2(x)$ this way, but the first few terms are usually easy
enough to compute by expanding for small $x$. Having got these, we can assume a
general series form for $u_2(x)$, and find the coefficients in the usual way.


Example

Let’s try to solve

$$x^2y'' + xy' + (x^2 - 1)y = 0, \qquad (1.22)$$

using the method of Frobenius. If we look for a solution in the usual form,
$y = \sum_{n=0}^{\infty}a_nx^{n+c}$, we find that

$$a_0(c^2 - 1)x^c + a_1\{(1 + c)^2 - 1\}x^{c+1} + \sum_{k=2}^{\infty}\left[a_k\{(k + c)^2 - 1\} + a_{k-2}\right]x^{k+c} = 0.$$

The indicial equation has roots $c = \pm 1$, and, by choosing either of these, we find
that $a_1 = 0$. If we now look for the general solution of

$$a_k = -\frac{a_{k-2}}{(k + c)^2 - 1},$$

we find that

$$a_2 = -\frac{a_0}{(2 + c)^2 - 1} = -\frac{a_0}{(1 + c)(3 + c)},$$

$$a_4 = -\frac{a_2}{(4 + c)^2 - 1} = \frac{a_0}{(1 + c)(3 + c)^2(5 + c)},$$

and so on. This gives us a solution of the form

$$y(x, c) = a_0x^c\left\{1 - \frac{x^2}{(1 + c)(3 + c)} + \frac{x^4}{(1 + c)(3 + c)^2(5 + c)} - \cdots\right\}.$$

Now, by choosing $c = 1$, we obtain one of the linearly independent solutions of
(1.22),

$$u_1(x) = y(x, 1) = x\left(1 - \frac{x^2}{2 \cdot 4} + \frac{x^4}{2 \cdot 4^2 \cdot 6} - \cdots\right).$$

However, if $c = -1$, the coefficients $a_{2n}$ for $n = 1, 2, \ldots$ are singular.
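
Equation (1.22) is Bessel's equation of order one, so the series $u_1(x)$ can be compared with MATLAB's built-in `besselj`. With the normalisation $a_0 = 1$ assumed here, the leading terms $x - x^3/8 + \cdots$ match those of $2J_1(x)$. The following sketch builds $u_1$ from the recurrence with $c = 1$; the truncation level and sample interval are illustrative choices.

```matlab
% Build u_1(x) = y(x,1) from the recurrence a_k = -a_{k-2}/((k+c)^2 - 1)
% with c = 1, a_0 = 1, a_1 = 0, and compare with 2*besselj(1,x).
c = 1;
K = 40;                          % highest index retained (arbitrary)
a = zeros(1, K+1);
a(1) = 1;                        % a(k+1) stores a_k; odd coefficients stay zero
for k = 2:2:K
    a(k+1) = -a(k-1) / ((k+c)^2 - 1);
end
x = linspace(0.1, 5, 50);        % sample points (illustrative)
u1 = zeros(size(x));
for k = 0:K
    u1 = u1 + a(k+1) * x.^(k+c); % accumulate a_k * x^(k+c)
end
err = max(abs(u1 - 2*besselj(1, x)));
disp(err)
```

Setting `c = -1` in the same loop produces a division by zero at `k = 2`, which is the numerical counterpart of the singular coefficients noted above.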

In order to find the structure of the second linearly independent solution, we use
the reduction of order formula, (1.3). Substituting for $u_1(x)$ gives

$$u_2(x) = x\left(1 - \frac{x^2}{8} + \cdots\right)\int^x\frac{1}{t^2\left(1 - \frac{t^2}{8} + \cdots\right)^2}\exp\left(-\int^t\frac{1}{s}\,ds\right)dt$$

$$= x\left(1 - \frac{x^2}{8} + \cdots\right)\int^x\frac{1}{t^3}\left(1 + \frac{t^2}{4} + \cdots\right)dt$$

$$= x\left(1 - \frac{x^2}{8} + \cdots\right)\left\{-\frac{1}{2x^2} + \frac{1}{4}\log x + \cdots\right\}.$$

The second linearly independent solution of (1.22) therefore has the structure

$$u_2(x) = \frac{1}{4}u_1(x)\log x - \frac{1}{2x}v(x),$$

where $v(x) = 1 + b_2x^2 + b_4x^4 + \cdots$. If we assume a solution structure of this form
and substitute it into (1.22), it is straightforward to pick off the coefficients $b_{2n}$.

Finally, note that we showed in Section 1.2.1 that the Wronskian of (1.22) is
$W = A/x$ for some constant $A$. Now, since we know that $u_1 = x + \cdots$ and
$u_2 = -1/2x + \cdots$, we must have $W = x(1/2x^2) + 1/2x + \cdots = 1/x + \cdots$, and hence
$A = 1$.


1.3.4 Singular Points of Differential Equations 

In this section, we give some definitions and a statement of a theorem that tells 
us when the method of Frobenius can be used, and for what values of x the infinite 
series will converge. We consider a second order, variable coefficient differential 
equation of the form 

$$P(x)\frac{d^2y}{dx^2} + Q(x)\frac{dy}{dx} + R(x)y = 0. \qquad (1.23)$$

Before we proceed, we need to further refine our ideas about singular points. If $x_0$
is a singular point and $(x - x_0)Q(x)/P(x)$ and $(x - x_0)^2R(x)/P(x)$ have convergent
Taylor series expansions about $x_0$, then $x = x_0$ is called a regular singular point;
otherwise, $x_0$ is called an irregular singular point.


Example

Consider the equation

$$x(x - 1)\frac{d^2y}{dx^2} + (1 + x)\frac{dy}{dx} + y = 0. \qquad (1.24)$$

There are singular points where $x^2 - x = 0$, at $x = 0$ and $x = 1$. Let’s start by
looking at the singular point $x = 0$. Consider the expression

$$\frac{xQ}{P} = \frac{x(1 + x)}{x^2 - x} = \frac{(1 + x)}{(x - 1)} = -(1 + x)(1 - x)^{-1}.$$

Upon expanding $(1 - x)^{-1}$ using the binomial expansion we have

$$\frac{xQ}{P} = -(1 + x)(1 + x + x^2 + \cdots + x^n + \cdots),$$

which can be multiplied out to give

$$\frac{xQ}{P} = -1 - 2x + \cdots.$$

This power series is convergent provided $|x| < 1$ (by considering the binomial
expansion used above). Now

$$\frac{x^2R}{P} = \frac{x^2}{x^2 - x} = \frac{x}{x - 1} = -x(1 - x)^{-1}.$$

Again, using the binomial expansion, which is convergent provided $|x| < 1$,

$$\frac{x^2R}{P} = -x(1 + x + x^2 + \cdots) = -x - x^2 - x^3 - \cdots.$$

Since $xQ/P$ and $x^2R/P$ have convergent Taylor series about $x = 0$, this is a regular
singular point.

Now consider the other singular point, $x = 1$. We note that

$$(x - 1)\frac{Q}{P} = \frac{(x - 1)(1 + x)}{(x^2 - x)} = \frac{(x - 1)(1 + x)}{x(x - 1)} = \frac{1 + x}{x}.$$

At this point we need to recall that we want information near $x = 1$, so we rewrite
$x$ as $1 - (1 - x)$, and hence

$$(x - 1)\frac{Q}{P} = \frac{2 - (1 - x)}{1 - (1 - x)}.$$

Expanding in powers of $(1 - x)$ using the binomial expansion gives

$$(x - 1)\frac{Q}{P} = 2\left\{1 - \frac{1 - x}{2}\right\}\{1 + (1 - x) + (1 - x)^2 + \cdots\},$$

which is a power series in $(x - 1)$ that is convergent provided $|x - 1| < 1$. We also
need to consider

$$(x - 1)^2\frac{R}{P} = \frac{(x - 1)^2}{x^2 - x} = \frac{(x - 1)^2}{x(x - 1)} = \frac{x - 1}{x} = \frac{x - 1}{\{1 - (1 - x)\}}$$

$$= (x - 1)\{1 - (1 - x)\}^{-1} = (x - 1)\{1 + (1 - x) + (1 - x)^2 + \cdots\},$$

which is again convergent provided $|x - 1| < 1$. Therefore $x = 1$ is a regular singular
point.

Theorem 1.3 If $x_0$ is an ordinary point of the ordinary differential equation (1.23),
then there exists a unique series solution in the neighbourhood of $x_0$ which converges
for $|x - x_0| < \rho$, where $\rho$ is the smaller of the radii of convergence of the series for
$Q(x)/P(x)$ and $R(x)/P(x)$.

If $x_0$ is a regular singular point, then there exists a unique series solution in the
neighbourhood of $x_0$, which converges for $|x - x_0| < \rho$, where $\rho$ is the smaller of the
radii of convergence of the series $(x - x_0)Q(x)/P(x)$ and $(x - x_0)^2R(x)/P(x)$.

Proof This can be found in Kreider, Kuller, Ostberg and Perkins (1966) and is due 
to Fuchs. We give an outline of why the result should hold in Section 1.3.5. We 
are more concerned here with using the result to tell us when a series solution will 
converge. □ 


Example

Consider the differential equation (1.24). We have already seen that $x = 0$ is a
regular singular point. The radii of convergence of $xQ(x)/P(x)$ and $x^2R(x)/P(x)$
are both unity, and hence the series solution $\sum_{n=0}^{\infty}a_nx^{n+c}$ exists, is unique, and
will converge for $|x| < 1$.


Example

Consider the differential equation

$$x^4\frac{d^2y}{dx^2} - \frac{dy}{dx} + y = 0.$$

For this equation, $x = 0$ is a singular point but it is not regular. The series solution
$\sum_{n=0}^{\infty}a_nx^{n+c}$ is not guaranteed to exist, since $xQ/P = -1/x^3$, which cannot be
expanded about $x = 0$.


1.3.5 An outline proof of Theorem 1.3 

We will now give a sketch of why Theorem 1.3 holds. Consider the equation

$$P(x)y'' + Q(x)y' + R(x)y = 0.$$

When $x = 0$ is an ordinary point, assuming that

$$P(x) = P_0 + xP_1 + \cdots, \quad Q(x) = Q_0 + xQ_1 + \cdots, \quad R(x) = R_0 + xR_1 + \cdots,$$

we can look for a solution using the method of Frobenius. When the terms are
ordered, we find that the first two terms in the expansion are

$$P_0a_0c(c - 1)x^{c-2} + \{P_0a_1c(c + 1) + P_1a_0c(c - 1) + Q_0a_0c\}x^{c-1} + \cdots = 0.$$

The indicial equation, $c(c - 1) = 0$, has two distinct roots that differ by an integer
and, following Frobenius General Rule I, we can choose $c = 0$ and find a solution
of the form $y = a_0y_0(x) + a_1y_1(x)$.

When $x = 0$ is a singular point, the simplest case to consider is when

$$P(x) = P_1x + P_2x^2 + \cdots.$$

We can then ask what form $Q(x)$ and $R(x)$ must take to ensure that a series solution
exists. When $x = 0$ is an ordinary point, the indicial equation was formed from the
$y''$ term in the equation alone. Let’s now try to include the $y'$ and $y$ terms as well,
by making the assumption that

$$Q(x) = Q_0 + Q_1x + \cdots, \quad R(x) = \frac{R_{-1}}{x} + R_0 + \cdots.$$

Then, after substitution of the Frobenius series into the equation, the coefficient of
$x^{c-1}$ gives the indicial equation as

$$P_1c(c - 1) + Q_0c + R_{-1} = 0.$$

This is a quadratic equation in $c$, with the usual possibilities for its types of roots.
As $x \to 0$

$$\frac{xQ(x)}{P(x)} \to \frac{Q_0}{P_1}, \quad \frac{x^2R(x)}{P(x)} \to \frac{R_{-1}}{P_1},$$

so that both of these quantities have a Taylor series expansion about $x = 0$. This
makes it clear that, when $P(x) = P_1x + \cdots$, these choices of expressions for the
behaviour of $Q(x)$ and $R(x)$ close to $x = 0$ are what is required to make it a regular
singular point. That the series converges, as claimed by the theorem, is most easily
shown using the theory of complex variables; this is done in Section A6.5.


1.3.6 The point at infinity 

Our discussion of singular points can be extended to include the point at infinity
by defining $s = 1/x$ and considering the properties of the point $s = 0$. In particular,
after using

$$\frac{dy}{dx} = -s^2\frac{dy}{ds}, \quad \frac{d^2y}{dx^2} = s^2\frac{d}{ds}\left(s^2\frac{dy}{ds}\right)$$

in (1.23), we find that

$$\hat{P}(s)\frac{d^2y}{ds^2} + \hat{Q}(s)\frac{dy}{ds} + \hat{R}(s)y = 0,$$

where

$$\hat{P}(s) = s^4P\left(\frac{1}{s}\right), \quad \hat{Q}(s) = 2s^3P\left(\frac{1}{s}\right) - s^2Q\left(\frac{1}{s}\right), \quad \hat{R}(s) = R\left(\frac{1}{s}\right).$$

For example, Bessel’s equation, which we will study in Chapter 3, has $P(x) = x^2$,
$Q(x) = x$ and $R(x) = x^2 - \nu^2$, where $\nu$ is a constant, and hence has a regular
singular point at $x = 0$. In this case, $\hat{P}(s) = s^2$, $\hat{Q}(s) = s$ and $\hat{R}(s) = 1/s^2 - \nu^2$.
Since $\hat{P}(0) = 0$, Bessel’s equation also has a singular point at infinity. In addition,
$s^2\hat{R}(s)/\hat{P}(s) = (1 - s^2\nu^2)/s^2$, which is singular at $s = 0$, and we conclude that
the point at infinity is an irregular singular point. We will study the behaviour
of solutions of Bessel’s equation in the neighbourhood of the point at infinity in
Section 12.2.7.


Exercises

1.1 Show that the functions $u_1$ are solutions of the differential equations given
below. Use the reduction of order method to find the second independent
solution, $u_2$.

(a) $u_1 = e^x$, $\quad (x - 1)\dfrac{d^2y}{dx^2} - x\dfrac{dy}{dx} + y = 0$,

(b) $u_1 = x^{-1}\sin x$, $\quad x\dfrac{d^2y}{dx^2} + 2\dfrac{dy}{dx} + xy = 0$.

1.2 Find the Wronskian of

(a) $x$, $x^2$,

(b) $e^x$, $e^{-x}$,

(c) $x\cos(\log|x|)$, $x\sin(\log|x|)$.

Which of these pairs of functions are linearly independent on the interval
$[-1, 1]$?

1.3 Find the general solution of 

(a) f|-2 d / + y = ^e-, 

d 2 y 


(b) '—^r + Ay = 2 sec 2x, 
dx z 


(c) 


d 2 y 

dx 2 

d 2 y 


Idy 
x dx 


1 - 


4x 2 


y = x, 


(d) + y = f(x), subject to y( 0) = y'( 0) = 0. 

1.4 If $u_1$ and $u_2$ are linearly independent solutions of

$$y'' + p(x)y' + q(x)y = 0$$

and $y$ is any other solution, show that the Wronskian of $\{y, u_1, u_2\}$,

$$W(x) = \begin{vmatrix} y & u_1 & u_2 \\ y' & u_1' & u_2' \\ y'' & u_1'' & u_2'' \end{vmatrix},$$

is zero everywhere. Hence find a second order differential equation which
has solutions $y = x$ and $y = \log x$.

1.5 Find the Wronskian of the solutions of the differential equation
$(1 - x^2)y'' - 2xy' + 2y = 0$ to within a constant. Use the method of Frobenius
to determine this constant.

1.6 Find the two linearly independent solutions of each of the differential
equations

(a) 


, d 2 y 

dx 2 

' 2 y 


X 2 


dy 

dx 


+ x</ = °, 


dy 


using the method of Frobenius.

1.7 Show that the indicial equation for


x(l 



+ (1 



dy = 0 


has a double root. Obtain one series solution of the equation in the form 


y = A'^^n 2 x n 1 = Aui(x). 

n = 1 

What is the radius of convergence of this series? Obtain the second solution 
of the equation in the form 


1.8 


1.9 


1.10 


U 2 (x) = U\{x) logx + Ui(x) (— dx +•••). 


(a) The points x = ±1 are singular points of the differential equation 

Show that one of them is a regular singular point and that the other 
is an irregular singular point. 

(b) Find two linearly independent Frobenius solutions of 


d 2 y , .dy 


~xy = 0, 


which are valid for x > 0. 

Find the general solution of the differential equation 


2x 


d 2 y 

dx 2 


+ (1 + x) 


dy 

dx 


ky = 0 


(where k is a real constant) in power series form. For which values of k is 
there a polynomial solution? 

Let $\alpha$, $\beta$, $\gamma$ denote real numbers and consider the hypergeometric equation

$$x(1 - x)\frac{d^2y}{dx^2} + \{\gamma - (\alpha + \beta + 1)x\}\frac{dy}{dx} - \alpha\beta y = 0.$$

Show that $x = 0$ and $x = 1$ are regular singular points and obtain the roots
of the indicial equation at the point $x = 0$. Show that if $\gamma$ is not an integer,
there are two linearly independent solutions of Frobenius form. Express $a_1$
and $a_2$ in terms of $a_0$ for each of the solutions.


1.11 


1.12 


1.13 


Show that each of the equations 


,d“y 


dy 


( a ) x ZTi+x "T — h y = 0, 


dx 2 

,, . nd 2 h 
(b) 1 *? + 


dx 

^- 2 » = 0, 

dx 


has an irregular singular point at x = 0. Show that equation (a) has no 
solution of Frobenius type but that equation (b) does. Obtain this solution 
and hence find the general solution of equation (b) . 

Show that x = 0 is a regular singular point of the differential equation 


o 2 d 2 V , n x dy 
2x d^ +x( 1 - x) te- y = a - 

Find two linearly independent Frobenius solutions and show that one of 
these solutions can be expressed in terms of elementary functions. Verify 
directly that this function satisfies the differential equation. 

Find the two linearly independent solutions of the ordinary differential 
equation 


x(x — 1 )y" + 3 xy' + y = 0 


in the form of a power series. Hint: It is straightforward to find one solution, 
but you will need to use the reduction of order formula to determine the 
structure of the second solution. 

1.14 * Show that, if f(x) and g(x) are nontrivial solutions of the differential 

equations u" + p(x)u = 0 and v" + q(x)v = 0, and p(x) ^ q(x), f{x) 
vanishes at least once between any two zeros of g{x) unless p = q and 
/ = (this is known as the Sturm comparison theorem) . 

1.15 * Show that, if q(x) ^ 0, no nontrivial solution of u" + q{x)u = 0 can have 
more than one zero. 



CHAPTER TWO 


Legendre Functions 


Legendre’s equation occurs in many areas of applied mathematics, physics and 
chemistry in physical situations with a spherical geometry. We are now in a position 
to examine Legendre’s equation in some detail using the ideas that we developed 
in the previous chapter. The Legendre functions are the solutions of Legendre’s 
equation, a second order linear differential equation with variable coefficients. This 
equation was introduced by Legendre in the late 18th century, and takes the form 

$$(1 - x^2)\frac{d^2y}{dx^2} - 2x\frac{dy}{dx} + n(n + 1)y = 0, \qquad (2.1)$$

where $n$ is a parameter, called the order of the equation. The equation is usually
defined for $-1 < x < 1$ for reasons that will become clear in Section 2.6. We can see
immediately that $x = 0$ is an ordinary point of the equation, and, by Theorem 1.3,
a series solution will be convergent for $|x| < 1$.

2.1 Definition of the Legendre Polynomials, $P_n(x)$

We will use the method of Frobenius, and seek a power series solution of (2.1) in 
the form 

$$y = \sum_{i=0}^{\infty}a_ix^{i+c}.$$

Substitution of this series into (2.1) leads to

$$(1 - x^2)\sum_{i=0}^{\infty}a_i(i + c)(i + c - 1)x^{i+c-2} - 2x\sum_{i=0}^{\infty}a_i(i + c)x^{i+c-1} + n(n + 1)\sum_{i=0}^{\infty}a_ix^{i+c} = 0.$$

Tidying this up gives

$$\sum_{i=0}^{\infty}a_i(i + c)(i + c - 1)x^{i+c-2} + \sum_{i=0}^{\infty}a_i\{n(n + 1) - (i + c)(i + c + 1)\}x^{i+c} = 0,$$

and collecting like powers of $x$ we get

$$a_0c(c - 1)x^{c-2} + a_1c(c + 1)x^{c-1} + \sum_{i=2}^{\infty}a_i(i + c)(i + c - 1)x^{i+c-2} + \sum_{i=0}^{\infty}a_i\{n(n + 1) - (i + c)(i + c + 1)\}x^{i+c} = 0.$$

Rearranging the series gives us

$$a_0c(c - 1)x^{c-2} + a_1c(c + 1)x^{c-1} + \sum_{i=2}^{\infty}\{a_i(i + c)(i + c - 1) + a_{i-2}(n(n + 1) - (i + c - 2)(i + c - 1))\}x^{i+c-2} = 0.$$

The indicial equation is therefore 


c(c — 1) = 0, 


which gives $c = 0$ or $1$. Following Frobenius General Rule I, we choose $c = 0$, so
that $a_1$ is arbitrary and acts as a second constant. In general we have

$$a_i = \frac{(i - 1)(i - 2) - n(n + 1)}{i(i - 1)}a_{i-2} \quad \text{for } i = 2, 3, \ldots.$$

For $i = 2$,

$$a_2 = \frac{-n(n + 1)}{2}a_0.$$

For $i = 3$,

$$a_3 = \frac{\{2 \cdot 1 - n(n + 1)\}}{3 \cdot 2}a_1.$$

For $i = 4$,

$$a_4 = \frac{\{3 \cdot 2 - n(n + 1)\}}{4 \cdot 3}a_2 = \frac{-\{3 \cdot 2 - n(n + 1)\}n(n + 1)}{4!}a_0.$$

For $i = 5$,

$$a_5 = \frac{\{4 \cdot 3 - n(n + 1)\}}{5 \cdot 4}a_3 = \frac{\{4 \cdot 3 - n(n + 1)\}\{2 \cdot 1 - n(n + 1)\}}{5!}a_1.$$

The solution of Legendre’s equation is therefore

$$y = a_0\left\{1 - \frac{n(n + 1)}{2!}x^2 - \frac{\{3 \cdot 2 - n(n + 1)\}n(n + 1)}{4!}x^4 + \cdots\right\}$$

$$+ a_1\left\{x + \frac{\{2 \cdot 1 - n(n + 1)\}}{3!}x^3 + \frac{\{4 \cdot 3 - n(n + 1)\}\{2 \cdot 1 - n(n + 1)\}}{5!}x^5 + \cdots\right\}, \qquad (2.2)$$


where $a_0$ and $a_1$ are arbitrary constants. If $n$ is not a positive integer, we have two
infinite series solutions, convergent for $|x| < 1$. If $n$ is a positive integer, one of the
infinite series terminates to give a simple polynomial solution.

If we write the solution (2.2) as $y = a_0u_n(x) + a_1v_n(x)$, when $n$ is an integer,
then $u_0(x)$, $v_1(x)$, $u_2(x)$ and $v_3(x)$ are the first four polynomial solutions. It is
convenient to write the solution in the form

$$y = AP_n(x) + BQ_n(x),$$

in which $P_n(x)$ is a polynomial of degree $n$ and $Q_n(x)$ is an infinite series that
converges for $|x| < 1$, and where we have defined $P_n(x) = u_n(x)/u_n(1)$ for $n$ even
and $P_n(x) = v_n(x)/v_n(1)$ for $n$ odd. The polynomials $P_n(x)$ are called the
Legendre polynomials and can be written as

$$P_n(x) = \sum_{r=0}^{m}(-1)^r\frac{(2n - 2r)!\,x^{n-2r}}{2^nr!(n - r)!(n - 2r)!},$$

where $m$ is the integer part of $n/2$. Note that by definition $P_n(1) = 1$. The first
five of these are

$$P_0(x) = 1, \quad P_1(x) = x, \quad P_2(x) = \frac{1}{2}(3x^2 - 1),$$

$$P_3(x) = \frac{1}{2}(5x^3 - 3x), \quad P_4(x) = \frac{35}{8}x^4 - \frac{15}{4}x^2 + \frac{3}{8}.$$
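
The explicit sum for $P_n(x)$ is easy to evaluate numerically. As a minimal sketch, the following builds the polynomial coefficients from the sum and compares the result with the first row returned by MATLAB's built-in `legendre` function; the range of $n$, the sample points and the variable names are illustrative assumptions.

```matlab
% Coefficients of P_n(x) from the explicit sum, compared with the first row
% of MATLAB's legendre(n,x), for n = 0..6.
x = linspace(-1, 1, 11);                 % sample points (illustrative)
err = 0;
for n = 0:6
    m = floor(n/2);                      % integer part of n/2
    p = zeros(1, n+1);                   % polynomial coefficients, highest power first
    for r = 0:m
        % coefficient of x^(n-2r), stored at position 2r+1 for polyval
        p(2*r+1) = (-1)^r * factorial(2*n-2*r) / ...
            (2^n * factorial(r) * factorial(n-r) * factorial(n-2*r));
    end
    L = legendre(n, x);                  % first row holds P_n at the points x
    err = max(err, max(abs(polyval(p, x) - L(1,:))));
end
disp(err)
```

After the loop, `p` holds the coefficients of $P_6(x)$, and `polyval(p, 1)` should return 1, in line with the normalisation $P_n(1) = 1$.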

Graphs of these Legendre polynomials are shown in Figure 2.1, which was generated 
using the MATLAB script 



Note that the MATLAB function legendre(n,x) generates both the Legendre
polynomial of order n and the associated Legendre functions of orders 1 to n, 
which we will meet later, so we have to pick off the Legendre polynomial as the 
first row of the matrix p. 
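
A minimal sketch of a script of this kind, not necessarily the one used to produce Figure 2.1, is given below. It assumes only the built-in `legendre` function described above; the number of plotting points and the labels are arbitrary choices.

```matlab
% Plot P_1(x), ..., P_4(x) on [-1,1] using the built-in legendre function.
x = linspace(-1, 1, 500);        % plotting grid (arbitrary resolution)
hold on
for n = 1:4
    p = legendre(n, x);          % (n+1)-by-500 matrix
    plot(x, p(1,:))              % first row is the Legendre polynomial P_n(x)
end
hold off
xlabel('x'), legend('P_1', 'P_2', 'P_3', 'P_4')
```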

Simple expressions for the $Q_n(x)$ are available for $n = 0, 1, 2, 3$ using the
reduction of order formula, (1.3). In particular

$$Q_0(x) = \frac{1}{2}\log\left(\frac{1 + x}{1 - x}\right), \quad Q_1(x) = \frac{1}{2}x\log\left(\frac{1 + x}{1 - x}\right) - 1,$$

$$Q_2(x) = \frac{1}{4}(3x^2 - 1)\log\left(\frac{1 + x}{1 - x}\right) - \frac{3}{2}x, \quad Q_3(x) = \frac{1}{4}(5x^3 - 3x)\log\left(\frac{1 + x}{1 - x}\right) - \frac{5}{2}x^2 + \frac{2}{3}.$$

These functions are singular at $x = \pm 1$. Notice that part of the infinite series has
been summed to give us the logarithmic terms. Graphs of these functions are shown
in Figure 2.2.

Fig. 2.1. The Legendre polynomials $P_1(x)$, $P_2(x)$, $P_3(x)$ and $P_4(x)$.

Example 

Let’s try to find the general solution of

$$(1 - x^2)y'' - 2xy' + 2y = \frac{1}{1 - x^2}$$

for $-1 < x < 1$. This is just an inhomogeneous version of Legendre’s equation of
order one. The complementary function is

$$y_h = AP_1(x) + BQ_1(x).$$

The variation of parameters formula, (1.6), then shows that the particular integral
is

$$y_p = \int^x\frac{P_1(s)Q_1(x) - P_1(x)Q_1(s)}{(1 - s^2)\{P_1(s)Q_1'(s) - P_1'(s)Q_1(s)\}}\,ds.$$

We can considerably simplify this rather complicated looking result. Firstly, Abel’s
formula, (1.7), shows that the Wronskian is

$$W = P_1(s)Q_1'(s) - P_1'(s)Q_1(s) = \frac{W_0}{1 - s^2}.$$

Fig. 2.2. The first four Legendre functions, $Q_0(x)$, $Q_1(x)$, $Q_2(x)$ and $Q_3(x)$.

We can determine the constant $W_0$ by considering the behaviour of $W$ as $s \to 0$.
Since

$$P_1(s) = s, \quad Q_1(s) = \frac{1}{2}s\log\left(\frac{1 + s}{1 - s}\right) - 1,$$

$W = 1 + s^2 + \cdots$ for $s \ll 1$. From the binomial theorem, $1/(1 - s^2) = 1 + s^2 + \cdots$
for $s \ll 1$. We conclude that $W_0 = 1$. This means that

$$y_p(x) = Q_1(x)\int^xP_1(s)\,ds - P_1(x)\int^xQ_1(s)\,ds = \frac{1}{2}x^2Q_1(x) - x\left\{\frac{1}{4}(x^2 - 1)\log\left(\frac{1 + x}{1 - x}\right) - \frac{1}{2}x\right\}.$$

The general solution is this particular integral plus the complementary function
($y_p(x) + y_h(x)$).

2.2 The Generating Function for $P_n(x)$

In order to make a more systematic study of the Legendre polynomials, it is helpful
to introduce a generating function, $G(x, t)$. This function is defined in such a
way that the coefficients of the Taylor series of $G(x, t)$ around $t = 0$ are $P_n(x)$.

We start with the assertion that

$$G(x, t) = (1 - 2xt + t^2)^{-1/2} = \sum_{n=0}^{\infty}P_n(x)t^n. \qquad (2.3)$$

Just to motivate this formula, let’s consider the first terms in the Taylor series
expansion about $t = 0$. Using the binomial expansion formula, which is convergent
for $|t^2 - 2xt| < 1$ (for $|x| < 1$ we can ensure that this holds by making $|t|$ small
enough),

$$\{1 + (-2xt + t^2)\}^{-1/2} = 1 + \left(-\frac{1}{2}\right)(-2xt + t^2) + \frac{\left(-\frac{1}{2}\right)\left(-\frac{3}{2}\right)}{2!}(-2xt + t^2)^2 + \cdots$$

$$= 1 + xt + \frac{1}{2}(3x^2 - 1)t^2 + \cdots = P_0(x) + P_1(x)t + P_2(x)t^2 + \cdots,$$

as expected. With a little extra work we can derive (2.3).

We start by working with

$$(1 - 2xt + t^2)^{-1/2} = \sum_{n=0}^{\infty}Z_n(x)t^n, \qquad (2.4)$$

where, using the binomial expansion, we know that $Z_n(x)$ is a polynomial of degree
$n$. We first differentiate with respect to $x$, which gives us

$$t(1 - 2xt + t^2)^{-3/2} = \sum_{n=0}^{\infty}Z_n'(x)t^n,$$

and again gives

$$3t^2(1 - 2xt + t^2)^{-5/2} = \sum_{n=0}^{\infty}Z_n''(x)t^n.$$

Now we differentiate (2.4) with respect to $t$, which leads to

$$(x - t)(1 - 2xt + t^2)^{-3/2} = \sum_{n=0}^{\infty}Z_n(x)nt^{n-1}.$$

Multiplying this last result by $t^2$ and differentiating with respect to $t$ gives

$$\sum_{n=0}^{\infty}Z_n(x)n(n + 1)t^n = \frac{\partial}{\partial t}\left\{t^2(x - t)(1 - 2xt + t^2)^{-3/2}\right\}$$

$$= t^2\left\{(x - t)(1 - 2xt + t^2)^{-5/2}3(x - t) + (1 - 2xt + t^2)^{-3/2}(-1)\right\} + (x - t)(1 - 2xt + t^2)^{-3/2}2t,$$

which we can simplify to get

$$(1 - 2xt + t^2)^{-3/2}\left\{3t^2(x - t)^2(1 - 2xt + t^2)^{-1} - t^2 + 2t(x - t)\right\} = \sum_{n=0}^{\infty}Z_n(x)n(n + 1)t^n.$$
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Combining all of these results gives 

OO OO OO 

(! ^ X 2 ) Z n(x) tn - 2 x J 2 Z n( X ) tn + J2 n ( n + l ) Z n{x)t n 
n — 0 n— 0 n — 0 

= (1 - x 2 )3t 2 (l - 2xt + t 2 )~ 5/2 - 2xt(l - 2xt + t 2 )~ 3/2 

+(1 — 2 xt + t 2 )~ 3 ^ 2 {3t 2 (x — f) 2 (l — 2 xt + t 2 )^ 1 — 1 + 2 t(x — t)} = 0, 
for any t. Therefore 

(1 — x 2 )Z'^(x) - 2 xZ' n (x) + n(n + 1 )Z n (x) = 0, 

which is just Legendre’s equation. This means that Z n (x) = aP n (x) + (3Q n (x) 
where a,/3 are constants. As Z n (x) is a polynomial of degree n, j3 must be zero. 

Finally, we need to show that Z n (l) = 1. This is done by putting x = 1 in the 
generating function relationship, (2.3), to obtain 

OO 

(1-2 1 + t 2 )~ 1/2 = J2 Z n(l)t n - 

n — 0 

Since 

OO 

( 1-2 1 + t 2 )~ l/ 2 = {(i - 1) 2 }~ 1/2 = (i - ty 1 = 

n — 0 

at least for \t\ < 1, we have Z n ( 1) = 1. Since we know that P n ( 1) = 1, we conclude 
that Z n {x) = P n (x ), as required. 

The generating function, $G(x,t) = (1 - 2xt + t^2)^{-1/2}$, can be used to prove a
number of interesting properties of Legendre polynomials, as well as some recurrence
formulae. We will give a few examples of its application.

Special Values 

The generating function is useful for determining the values of the Legendre
polynomials for certain special values of $x$. For example, substituting $x = -1$ in (2.3)
gives

$$(1 + 2t + t^2)^{-1/2} = \sum_{n=0}^{\infty} P_n(-1)\,t^n.$$

By the binomial expansion we have that

$$(1 + 2t + t^2)^{-1/2} = \{(1 + t)^2\}^{-1/2} = (1 + t)^{-1} = 1 - t + t^2 - \cdots + (-t)^n + \cdots = \sum_{n=0}^{\infty} (-1)^n t^n.$$

We conclude that

$$\sum_{n=0}^{\infty} (-1)^n t^n = \sum_{n=0}^{\infty} P_n(-1)\,t^n,$$

and therefore $P_n(-1) = (-1)^n$ for $n = 1, 2, \ldots$.

2.3 Differential and Recurrence Relations Between Legendre Polynomials

The generating function can also be used to derive recurrence relations between the
various $P_n(x)$. Starting with (2.3), we differentiate with respect to $t$, and find that

$$(x - t)(1 - 2xt + t^2)^{-3/2} = \sum_{n=0}^{\infty} P_n(x)\,n\,t^{n-1}.$$

We now multiply through by $(1 - 2xt + t^2)$ to obtain

$$(x - t)(1 - 2xt + t^2)^{-1/2} = (1 - 2xt + t^2)\sum_{n=0}^{\infty} P_n(x)\,n\,t^{n-1},$$

which leads to

$$x\sum_{n=0}^{\infty} P_n(x)\,t^n - \sum_{n=0}^{\infty} P_n(x)\,t^{n+1} = \sum_{n=0}^{\infty} nP_n(x)\,t^{n-1} - 2x\sum_{n=0}^{\infty} nP_n(x)\,t^n + \sum_{n=0}^{\infty} nP_n(x)\,t^{n+1}.$$

Equating coefficients of $t^n$ on both sides shows that

$$xP_n(x) - P_{n-1}(x) = (n+1)P_{n+1}(x) - 2xnP_n(x) + (n-1)P_{n-1}(x),$$

and hence

$$(n+1)P_{n+1}(x) - (2n+1)xP_n(x) + nP_{n-1}(x) = 0. \tag{2.5}$$

This is a recurrence relation between $P_{n+1}(x)$, $P_n(x)$ and $P_{n-1}(x)$, which can be
used to compute the polynomials $P_n(x)$. Starting with $P_0(x) = 1$ and $P_1(x) = x$,
we substitute $n = 1$ into (2.5), which gives

$$2P_2(x) - 3x^2 + 1 = 0,$$

and hence

$$P_2(x) = \frac{1}{2}(3x^2 - 1).$$

By iterating this procedure, we can generate the Legendre polynomials $P_n(x)$ for
any $n$.
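
The iteration just described is easy to mechanise. The following is a minimal Python sketch, not part of the original text, that applies the recurrence (2.5) to lists of exact rational coefficients (lowest power first). The function names `legendre_coeffs` and `evaluate` are our own illustrative choices.

```python
from fractions import Fraction


def legendre_coeffs(n_max):
    """Return coefficient lists (lowest power first) of P_0, ..., P_n_max.

    Uses the recurrence (2.5):
        (n + 1) P_{n+1}(x) = (2n + 1) x P_n(x) - n P_{n-1}(x).
    """
    polys = [[Fraction(1)], [Fraction(0), Fraction(1)]]  # P_0 = 1, P_1 = x
    for n in range(1, n_max):
        x_pn = [Fraction(0)] + polys[n]            # multiply P_n by x
        prev = polys[n - 1] + [Fraction(0)] * 2    # pad P_{n-1} to the same length
        nxt = [((2 * n + 1) * a - n * b) / (n + 1) for a, b in zip(x_pn, prev)]
        polys.append(nxt)
    return polys[: n_max + 1]


def evaluate(coeffs, x):
    """Evaluate a polynomial from lowest-power-first coefficients (Horner's rule)."""
    total = 0
    for c in reversed(coeffs):
        total = total * x + c
    return total


polys = legendre_coeffs(4)
print(polys[2])  # coefficients of P_2 = (3x^2 - 1)/2, lowest power first
```

Working with `Fraction` keeps the coefficients exact, so the result for $n = 2$ can be compared directly with the hand calculation above, and `evaluate(p, 1)` returns exactly 1 for each polynomial, in agreement with $P_n(1) = 1$.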

In a rather similar manner we can generate a recurrence relation that involves
the derivatives of the Legendre polynomials. Firstly, we differentiate the generating
function, (2.3), with respect to $x$ to get

$$t(1 - 2xt + t^2)^{-3/2} = \sum_{n=0}^{\infty} t^n P_n'(x).$$

Differentiation of (2.3) with respect to $t$ gives

$$(x - t)(1 - 2xt + t^2)^{-3/2} = \sum_{n=0}^{\infty} n\,t^{n-1} P_n(x).$$

Combining these gives

$$\sum_{n=0}^{\infty} n\,t^n P_n(x) = (x - t)\sum_{n=0}^{\infty} t^n P_n'(x),$$

and by equating coefficients of $t^n$ we obtain the recurrence relation

$$nP_n(x) = xP_n'(x) - P_{n-1}'(x).$$

An Example From Electrostatics 

In electrostatics, the potential due to a unit point charge at $\mathbf{r} = \mathbf{r}_0$ is

$$V = \frac{1}{|\mathbf{r} - \mathbf{r}_0|}.$$

If this unit charge lies on the $z$-axis, at $x = y = 0$, $z = a$, this becomes

$$V = \frac{1}{\sqrt{x^2 + y^2 + (z - a)^2}}.$$

In terms of spherical polar coordinates, $(r, \theta, \phi)$,

$$x = r\sin\theta\cos\phi, \quad y = r\sin\theta\sin\phi, \quad z = r\cos\theta.$$

This means that

$$x^2 + y^2 + (z - a)^2 = x^2 + y^2 + z^2 - 2az + a^2 = r^2 + a^2 - 2az,$$

and hence

$$V = \frac{1}{\sqrt{r^2 + a^2 - 2ar\cos\theta}} = \frac{1}{a}\left(1 - 2\cos\theta\,\frac{r}{a} + \frac{r^2}{a^2}\right)^{-1/2}.$$

As we would expect from the symmetry of the problem, there is no dependence
upon the azimuthal angle, $\phi$. We can now use the generating function to write this
as a power series,

$$V = \frac{1}{a}\sum_{n=0}^{\infty} P_n(\cos\theta)\left(\frac{r}{a}\right)^n.$$
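
As a numerical illustration, which is our own addition rather than part of the text, the sketch below compares the closed form of the potential with a partial sum of the series. It assumes $r < a$, so that $t = r/a$ lies inside the region where the generating function expansion converges. The sample values of $r$, $\theta$ and $a$, and the function names, are arbitrary choices.

```python
import math


def legendre(n, x):
    """P_n(x) from the recurrence (2.5), starting with P_0 = 1 and P_1 = x."""
    p_prev, p = 1.0, x
    if n == 0:
        return p_prev
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    return p


def potential_exact(r, theta, a):
    """Potential of a unit charge at z = a, evaluated at the point (r, theta)."""
    return 1.0 / math.sqrt(r * r + a * a - 2.0 * a * r * math.cos(theta))


def potential_series(r, theta, a, n_terms=60):
    """Partial sum of (1/a) * sum_n P_n(cos theta) (r/a)^n, assuming r < a."""
    ratio = r / a
    mu = math.cos(theta)
    return sum(legendre(n, mu) * ratio**n for n in range(n_terms)) / a


r, theta, a = 0.5, 1.0, 2.0  # illustrative values with r < a
print(potential_exact(r, theta, a), potential_series(r, theta, a))
```

The two printed numbers should agree to rounding error, since the neglected terms are bounded by $(r/a)^{60}$.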


2.4 Rodrigues’ Formula 

There are other methods of generating the Legendre polynomials, and the most
useful of these is Rodrigues’ formula,

$$P_n(x) = \frac{1}{2^n n!}\frac{d^n}{dx^n}\left\{(x^2 - 1)^n\right\}.$$

For example, for $n = 1$,

$$P_1(x) = \frac{1}{2^1\,1!}\frac{d^1}{dx^1}\left\{(x^2 - 1)\right\} = x,$$

whilst for $n = 2$,

$$P_2(x) = \frac{1}{2^2\,2!}\frac{d^2}{dx^2}\left\{(x^2 - 1)^2\right\} = \frac{1}{4\cdot 2}\frac{d}{dx}\left\{4x(x^2 - 1)\right\} = \frac{1}{2}(3x^2 - 1).$$

The general proof of this result is by induction on n, which we leave as an exercise. 
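
Although it is no substitute for the proof, Rodrigues’ formula can be checked for small $n$ by direct computation. The sketch below is our own illustration: it expands $(x^2 - 1)^n$ with the binomial theorem, differentiates the coefficient list $n$ times and applies the factor $1/(2^n n!)$. The name `rodrigues` is hypothetical.

```python
from fractions import Fraction
from math import comb, factorial


def rodrigues(n):
    """Coefficients (lowest power first) of P_n from Rodrigues' formula."""
    # Binomial expansion of (x^2 - 1)^n: the coefficient of x^(2k) is C(n, k) (-1)^(n-k).
    coeffs = [Fraction(0)] * (2 * n + 1)
    for k in range(n + 1):
        coeffs[2 * k] = Fraction(comb(n, k) * (-1) ** (n - k))
    # Differentiate n times: the x^j term becomes j * c_j * x^(j-1).
    for _ in range(n):
        coeffs = [j * c for j, c in enumerate(coeffs)][1:]
    scale = Fraction(1, 2**n * factorial(n))
    return [scale * c for c in coeffs]


print(rodrigues(2))  # coefficients of (3x^2 - 1)/2, lowest power first
```

For $n = 1$ and $n = 2$ this reproduces the two examples worked by hand above.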

Rodrigues’ formula can also be used to develop an integral representation of the
Legendre polynomials. In order to show how this is done, it is convenient to switch
from the real variable $x$ to the complex variable $z = x + iy$. We define the finite
complex Legendre polynomials in the same way as for a real variable. In particular
Rodrigues’ formula,

$$P_n(z) = \frac{1}{2^n n!}\frac{d^n}{dz^n}(z^2 - 1)^n,$$

will be useful to us here. Recall from complex variable theory (see Appendix 6)
that, if $f(z)$ is analytic and single-valued inside and on a simple closed curve $C$,

$$f^{(n)}(z) = \frac{n!}{2\pi i}\oint_C \frac{f(\xi)}{(\xi - z)^{n+1}}\,d\xi,$$

for $n \geq 0$, when $z$ is an interior point of $C$. Now, using Rodrigues’ formula, we have

$$P_n(z) = \frac{1}{2^{n+1}\pi i}\oint_C \frac{(\xi^2 - 1)^n}{(\xi - z)^{n+1}}\,d\xi, \tag{2.6}$$

which is known as Schläfli’s representation. The contour $C$ must, of course,
enclose the point $\xi = z$ and be traversed in an anticlockwise sense. To simplify
matters, we now choose $C$ to be a circle, centred on $\xi = z$ with radius $|\sqrt{z^2 - 1}|$,
with $z \neq 1$. Putting $\xi = z + \sqrt{z^2 - 1}\,e^{i\theta}$ gives, after some simple manipulation,

$$P_n(z) = \frac{1}{2\pi}\int_{\theta=0}^{2\pi}\left(z + \sqrt{z^2 - 1}\cos\theta\right)^n d\theta.$$

This is known as Laplace’s representation. In fact it is also valid when $z = 1$,
since

$$P_n(1) = \frac{1}{2\pi}\int_0^{2\pi} 1\,d\theta = 1.$$

Laplace’s representation is useful, amongst other things, for providing a bound on
the size of the Legendre polynomials of real argument. For $z \in [-1, 1]$, we can write
$z = \cos\phi$ and use Laplace’s representation to show that

$$|P_n(\cos\phi)| \leq \frac{1}{2\pi}\int_0^{2\pi} |\cos\phi + i\sin\phi\cos\theta|^n\,d\theta.$$

Now, since

$$|\cos\phi + i\sin\phi\cos\theta| = \sqrt{\cos^2\phi + \sin^2\phi\cos^2\theta} \leq \sqrt{\cos^2\phi + \sin^2\phi} = 1,$$

we have $|P_n(\cos\phi)| \leq 1$.
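
Laplace’s representation is also convenient numerically. The sketch below, which is our own and uses the hypothetical name `laplace_legendre`, approximates the integral with the equally weighted (trapezium) rule on a periodic integrand, using complex arithmetic because $\sqrt{z^2 - 1}$ is imaginary for $-1 < z < 1$. The number of quadrature points is an arbitrary choice.

```python
import cmath
import math


def laplace_legendre(n, z, panels=200):
    """Approximate P_n(z) by the trapezium rule applied to Laplace's integral."""
    root = cmath.sqrt(z * z - 1)           # purely imaginary when -1 < z < 1
    total = 0
    for j in range(panels):                # periodic integrand: equal weights
        theta = 2 * math.pi * j / panels
        total += (z + root * math.cos(theta)) ** n
    return total / panels


value = laplace_legendre(2, 0.5)
print(value.real, value.imag)  # real part should be close to P_2(0.5) = -0.125
```

Because the integrand is a trigonometric polynomial in $\theta$, the equally spaced rule is very accurate here, and the computed values also respect the bound $|P_n(\cos\phi)| \leq 1$.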
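2.5 Orthogonality of the Legendre Polynomials

Legendre polynomials have the very important property of orthogonality on
$[-1, 1]$, that is

$$\int_{-1}^{1} P_n(x)P_m(x)\,dx = \frac{2}{2n+1}\,\delta_{mn}, \tag{2.7}$$

where the Kronecker delta is defined by

$$\delta_{mn} = \begin{cases} 1 & \text{for } m = n, \\ 0 & \text{for } m \neq n. \end{cases}$$

To show this, note that if $f \in C[-1, 1]$, Rodrigues’ formula shows that

$$\begin{aligned}
\int_{-1}^{1} f(x)P_n(x)\,dx &= \frac{1}{2^n n!}\int_{-1}^{1} f(x)\frac{d^n}{dx^n}\left\{(x^2 - 1)^n\right\}dx \\
&= \frac{1}{2^n n!}\left[f(x)\frac{d^{n-1}}{dx^{n-1}}\left\{(x^2 - 1)^n\right\}\right]_{-1}^{1} - \frac{1}{2^n n!}\int_{-1}^{1} f'(x)\frac{d^{n-1}}{dx^{n-1}}\left\{(x^2 - 1)^n\right\}dx \\
&= -\frac{1}{2^n n!}\int_{-1}^{1} f'(x)\frac{d^{n-1}}{dx^{n-1}}\left\{(x^2 - 1)^n\right\}dx.
\end{aligned}$$
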
Repeating this integration by parts $(n - 1)$ more times gives

$$\int_{-1}^{1} f(x)P_n(x)\,dx = \frac{(-1)^n}{2^n n!}\int_{-1}^{1} (x^2 - 1)^n f^{(n)}(x)\,dx, \tag{2.8}$$

where $f^{(n)}(x)$ is the $n$th derivative of $f(x)$. This result is interesting in its own
right, but for the time being consider the case $f(x) = P_m(x)$ with $m < n$, so that

$$\int_{-1}^{1} P_m(x)P_n(x)\,dx = \frac{(-1)^n}{2^n n!}\int_{-1}^{1} (x^2 - 1)^n P_m^{(n)}(x)\,dx.$$

Since the $n$th derivative of an $m$th order polynomial is zero for $m < n$, we have

$$\int_{-1}^{1} P_m(x)P_n(x)\,dx = 0 \quad \text{for } m < n.$$

By symmetry, this must also hold for $m > n$.

Let’s now consider the case $n = m$, for which

$$\int_{-1}^{1} P_n(x)P_n(x)\,dx = \frac{(-1)^n}{2^n n!}\int_{-1}^{1} (x^2 - 1)^n P_n^{(n)}(x)\,dx = \frac{(-1)^n}{2^n n!}\int_{-1}^{1} (x^2 - 1)^n \frac{1}{2^n n!}\frac{d^{2n}}{dx^{2n}}\left\{(x^2 - 1)^n\right\}dx.$$

Noting the fact that

$$(x^2 - 1)^n = x^{2n} + \cdots + (-1)^n,$$

and hence that

$$\frac{d^{2n}}{dx^{2n}}(x^2 - 1)^n = 2n(2n - 1)\cdots 3\cdot 2\cdot 1 = (2n)!,$$

we can see that

$$\int_{-1}^{1} P_n(x)P_n(x)\,dx = \frac{(-1)^n}{2^{2n}(n!)^2}\int_{-1}^{1} (x^2 - 1)^n (2n)!\,dx = \frac{(-1)^n (2n)!}{2^{2n}(n!)^2}\int_{-1}^{1} (x^2 - 1)^n\,dx.$$

To evaluate the remaining integral, we use a reduction formula to show that

$$\int_{-1}^{1} (x^2 - 1)^n\,dx = \frac{2^{2n+2}\,n!\,(n+1)!\,(-1)^n}{(2n+2)!},$$

and hence

$$\int_{-1}^{1} P_n^2(x)\,dx = \frac{2}{2n+1}.$$


This completes the derivation of (2.7). There is an easier proof of the first part 
of this, using the idea of a self-adjoint linear operator, which we will discuss in 
Chapter 4. 
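
The orthogonality relation (2.7) can be confirmed exactly for small $n$ and $m$. The sketch below is our own illustration: it builds the coefficients of $P_0, \ldots, P_5$ from the recurrence (2.5) and integrates products term by term, using $\int_{-1}^{1} x^k\,dx = 2/(k+1)$ for even $k$ and zero for odd $k$. The helper names are hypothetical.

```python
from fractions import Fraction


def legendre_coeffs(n_max):
    """Coefficient lists (lowest power first) of P_0..P_n_max via (2.5)."""
    polys = [[Fraction(1)], [Fraction(0), Fraction(1)]]
    for n in range(1, n_max):
        x_pn = [Fraction(0)] + polys[n]
        prev = polys[n - 1] + [Fraction(0)] * 2
        polys.append([((2 * n + 1) * a - n * b) / (n + 1) for a, b in zip(x_pn, prev)])
    return polys[: n_max + 1]


def inner_product(p, q):
    """Exact integral of p(x) q(x) over [-1, 1] for coefficient lists p and q."""
    total = Fraction(0)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if (i + j) % 2 == 0:                  # odd powers integrate to zero
                total += a * b * Fraction(2, i + j + 1)
    return total


P = legendre_coeffs(5)
gram = [[inner_product(P[m], P[n]) for n in range(6)] for m in range(6)]
print(gram[3][3], gram[2][4])
```

The matrix `gram` should be diagonal with entries $2/(2n+1)$, which is exactly the statement of (2.7).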

We have now shown that the Legendre polynomials are orthogonal on $[-1, 1]$.
It is also possible to show that these polynomials are complete in the function
space $C[-1, 1]$. This means that a continuous function can be expanded as a linear
combination of the Legendre polynomials. The proof of the completeness property
is rather difficult, and will be omitted here.† What we can present here is the
procedure for obtaining the coefficients of such an expansion for a given function
$f(x)$ belonging to $C[-1, 1]$. To do this we write

$$f(x) = a_0P_0(x) + a_1P_1(x) + \cdots + a_nP_n(x) + \cdots = \sum_{n=0}^{\infty} a_nP_n(x).$$

Multiplying by $P_m(x)$ and integrating over $[-1, 1]$ (more precisely, forming the inner
product with $P_m(x)$) gives

$$\int_{-1}^{1} f(x)P_m(x)\,dx = \int_{-1}^{1} P_m(x)\sum_{n=0}^{\infty} a_nP_n(x)\,dx.$$


† Just to give a flavour of the completeness proof for the case of Legendre polynomials, we note
that, because of their polynomial form, we can deduce that any polynomial can be written
as a linear combination of Legendre polynomials. However, according to a fundamental result
due to Weierstrass (the Weierstrass polynomial approximation theorem) any function which is
continuous on some interval can be approximated as closely as we wish by a polynomial. The
completeness then follows from the application of this theorem. The treatment of completeness
of other solutions of Sturm-Liouville problems may be more complicated, for example for Bessel
functions. A complete proof of this can be found in Kreider, Kuller, Ostberg and Perkins (1966).
We will return to this topic in Chapter 4.

Interchanging the order of summation and integration leads to

$$\int_{-1}^{1} f(x)P_m(x)\,dx = \sum_{n=0}^{\infty} a_n\left\{\int_{-1}^{1} P_n(x)P_m(x)\,dx\right\} = \sum_{n=0}^{\infty} a_n\frac{2}{2m+1}\,\delta_{mn} = \frac{2a_m}{2m+1},$$

using the orthogonality property, (2.7). This means that

$$a_m = \frac{2m+1}{2}\int_{-1}^{1} f(x)P_m(x)\,dx, \tag{2.9}$$

and we write the series as

$$f(x) = \sum_{n=0}^{\infty}\left\{\frac{2n+1}{2}\int_{-1}^{1} f(x)P_n(x)\,dx\right\}P_n(x).$$

This is called a Fourier-Legendre series.

Let’s consider a couple of examples. Firstly, when $f(x) = x^2$,

$$a_m = \frac{2m+1}{2}\int_{-1}^{1} x^2P_m(x)\,dx,$$

so that

$$a_0 = \frac{1}{2}\int_{-1}^{1} x^2\cdot 1\,dx = \frac{1}{2}\left[\frac{x^3}{3}\right]_{-1}^{1} = \frac{1}{2}\cdot\frac{2}{3} = \frac{1}{3},$$

$$a_1 = \frac{3}{2}\int_{-1}^{1} x^2\cdot x\,dx = \frac{3}{2}\left[\frac{x^4}{4}\right]_{-1}^{1} = 0,$$

$$a_2 = \frac{5}{2}\int_{-1}^{1} x^2\cdot\frac{1}{2}(3x^2 - 1)\,dx = \frac{5}{2}\left[\frac{3x^5}{2\cdot 5} - \frac{x^3}{3\cdot 2}\right]_{-1}^{1} = \frac{2}{3}.$$

Also, (2.8) shows that

$$a_m = 0 \quad \text{for } m = 3, 4, \ldots,$$

and therefore,

$$x^2 = \frac{1}{3}P_0(x) + \frac{2}{3}P_2(x).$$

A finite polynomial clearly has a finite Fourier-Legendre series.
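
For functions whose integrals are less convenient, the coefficient formula can be evaluated numerically. The following sketch is our own: it uses composite Simpson’s rule (an arbitrary choice of quadrature) to approximate $a_m = \frac{2m+1}{2}\int_{-1}^{1} f(x)P_m(x)\,dx$, and applies it to $f(x) = x^2$ as a check. The function names are hypothetical.

```python
def legendre(n, x):
    """P_n(x) from the recurrence (2.5)."""
    p_prev, p = 1.0, x
    if n == 0:
        return p_prev
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    return p


def simpson(g, lo, hi, panels=2000):
    """Composite Simpson's rule with an even number of panels."""
    h = (hi - lo) / panels
    total = g(lo) + g(hi)
    for j in range(1, panels):
        total += (4 if j % 2 else 2) * g(lo + j * h)
    return total * h / 3


def fourier_legendre_coeff(f, m):
    """a_m = (2m + 1)/2 times the integral of f(x) P_m(x) over [-1, 1]."""
    return (2 * m + 1) / 2 * simpson(lambda x: f(x) * legendre(m, x), -1.0, 1.0)


coeffs = [fourier_legendre_coeff(lambda x: x * x, m) for m in range(5)]
print([round(c, 10) for c in coeffs])
```

Up to quadrature error, the computed coefficients should reproduce $a_0 = 1/3$, $a_2 = 2/3$ and zero for the remaining values of $m$.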
Secondly, consider $f(x) = e^x$. In this case

$$a_m = \frac{2m+1}{2}\int_{-1}^{1} e^xP_m(x)\,dx,$$

and hence

$$a_0 = \frac{1}{2}\int_{-1}^{1} e^x\,dx = \frac{1}{2}\left(e - e^{-1}\right), \qquad a_1 = \frac{3}{2}\int_{-1}^{1} xe^x\,dx = 3e^{-1}.$$

To proceed with this calculation it is necessary to find a recurrence relation between
the $a_n$. This is best done by using Rodrigues’ formula, which gives

$$a_n = (2n+1)\left\{\frac{a_{n-2}}{2n-3} - a_{n-1}\right\} \quad \text{for } n = 2, 3, \ldots, \tag{2.10}$$

from which the values of $a_4$, $a_5$, $\ldots$ are easily computed.
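
The recurrence (2.10) is straightforward to iterate. The sketch below is our own and starts from $a_0 = \frac{1}{2}(e - e^{-1})$ and $a_1 = 3e^{-1}$; as a cross-check it also evaluates the defining integral with Simpson’s rule. The function names are hypothetical. Note that each step subtracts two nearly equal numbers, so in floating point arithmetic we would only trust this forward iteration for the first few coefficients.

```python
import math


def exp_coeffs_by_recurrence(n_max):
    """a_0..a_n_max for f(x) = e^x, from a_0, a_1 and the recurrence (2.10)."""
    a = [0.5 * (math.e - 1 / math.e), 3 / math.e]
    for n in range(2, n_max + 1):
        a.append((2 * n + 1) * (a[n - 2] / (2 * n - 3) - a[n - 1]))
    return a[: n_max + 1]


def exp_coeff_by_quadrature(n, panels=2000):
    """a_n from the defining integral, using Simpson's rule as a cross-check."""
    def legendre(k, x):
        p_prev, p = 1.0, x
        if k == 0:
            return p_prev
        for j in range(1, k):
            p_prev, p = p, ((2 * j + 1) * x * p - j * p_prev) / (j + 1)
        return p

    h = 2.0 / panels
    total = 0.0
    for j in range(panels + 1):
        x = -1.0 + j * h
        weight = 1 if j in (0, panels) else (4 if j % 2 else 2)
        total += weight * math.exp(x) * legendre(n, x)
    return (2 * n + 1) / 2 * total * h / 3


for n, a_n in enumerate(exp_coeffs_by_recurrence(4)):
    print(n, a_n, exp_coeff_by_quadrature(n))
```

The two columns of output should agree closely, and the rapid decay of $a_n$ is consistent with the smoothness of $e^x$.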

We will not examine the convergence of Fourier-Legendre series here as the
details are rather technical. Instead we content ourselves with a statement that the
Fourier-Legendre series converges uniformly on any closed subinterval of $(-1, 1)$
in which $f$ is continuous and differentiable. An extension of this result to the
space of piecewise continuous functions is that the series converges to the value
$\frac{1}{2}\left\{f(x_0^+) + f(x_0^-)\right\}$ at each point $x_0 \in (-1, 1)$ where $f$ has a right and left
derivative. We will prove a related theorem in Chapter 5.

2.6 Physical Applications of the Legendre Polynomials 

In this section we present some examples of Legendre polynomials as they arise 
in mathematical models of heat conduction and fluid flow in spherical geometries. 
In general, we will encounter the Legendre equation in situations where we have 
to solve partial differential equations containing the Laplacian in spherical polar 
coordinates. 

2.6.1 Heat Conduction 

Let’s derive the equation that governs the evolution of an initial distribution of
heat in a solid body with temperature $T$, density $\rho$, specific heat capacity $c$ and
thermal conductivity $k$. Recall that the specific heat capacity, $c$, is the amount
of heat required to raise the temperature of a unit mass of a substance by one
degree. The thermal conductivity, $k$, of a body appears in Fourier’s law, which
states that the heat flux per unit area, per unit time, $\mathbf{Q} = (Q_x, Q_y, Q_z)$, is related
to the temperature gradient, $\nabla T$, by the simple linear relationship $\mathbf{Q} = -k\nabla T$. If
we now consider a small element of our solid body at $(x, y, z)$ with sides of length
$\delta x$, $\delta y$ and $\delta z$, the temperature change in this element over a time interval $\delta t$ is
determined by the difference between the amount of heat that flows into the element
and the amount of heat that flows out, which gives

$$\begin{aligned}
\rho c\left\{T(x, y, z, t + \delta t) - T(x, y, z, t)\right\}\delta x\,\delta y\,\delta z
&= \left\{Q_x(x, y, z, t) - Q_x(x + \delta x, y, z, t)\right\}\delta t\,\delta y\,\delta z \\
&\quad + \left\{Q_y(x, y, z, t) - Q_y(x, y + \delta y, z, t)\right\}\delta t\,\delta x\,\delta z \\
&\quad + \left\{Q_z(x, y, z, t) - Q_z(x, y, z + \delta z, t)\right\}\delta t\,\delta x\,\delta y.
\end{aligned} \tag{2.11}$$

            $\rho$ (kg m$^{-3}$)   $k$ (J m$^{-1}$ s$^{-1}$ K$^{-1}$)   $c$ (J kg$^{-1}$ K$^{-1}$)   $K$ (m$^2$ s$^{-1}$)
copper      8920                   385                                  386                          $1.1 \times 10^{-4}$
water       1000                   254                                  4186                         $6.1 \times 10^{-5}$
glass       2800                   0.8                                  840                          $3.4 \times 10^{-7}$

Table 2.1. Some typical physical properties of copper, water (at room temperature
and pressure) and glass.


Note that a typical term on the right hand side of this, for example,

$$\left\{Q_x(x, y, z, t) - Q_x(x + \delta x, y, z, t)\right\}\delta t\,\delta y\,\delta z,$$

is the amount of heat crossing the $x$-orientated faces of the element, each with area
$\delta y\,\delta z$, during the time interval $(t, t + \delta t)$. Taking the limit $\delta t, \delta x, \delta y, \delta z \to 0$, we
obtain

$$\rho c\frac{\partial T}{\partial t} = -\left\{\frac{\partial Q_x}{\partial x} + \frac{\partial Q_y}{\partial y} + \frac{\partial Q_z}{\partial z}\right\} = -\nabla\cdot\mathbf{Q}.$$

Substituting in Fourier’s law, $\mathbf{Q} = -k\nabla T$, gives the diffusion equation,

$$\frac{\partial T}{\partial t} = K\nabla^2 T, \tag{2.12}$$

where $K = k/\rho c$ is called the thermal diffusivity. Table 2.1 contains the values
of relevant properties for three everyday materials.

When the temperature reaches a steady state ($\partial T/\partial t = 0$), this equation takes
the simple form

$$\nabla^2 T = 0, \tag{2.13}$$

which is known as Laplace’s equation. It must be solved in conjunction with
appropriate boundary conditions, which drive the temperature gradients in the
body.
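
As a small consistency check on Table 2.1, the thermal diffusivity can be recomputed from the other three columns using $K = k/\rho c$. The sketch below is our own; it simply takes the tabulated values of $\rho$, $k$ and $c$ as given.

```python
# Values of (rho, k, c) as listed in Table 2.1, in SI units.
materials = {
    "copper": (8920.0, 385.0, 386.0),
    "water": (1000.0, 254.0, 4186.0),
    "glass": (2800.0, 0.8, 840.0),
}


def diffusivity(rho, k, c):
    """Thermal diffusivity K = k / (rho * c), in m^2 s^-1."""
    return k / (rho * c)


for name, (rho, k, c) in materials.items():
    print(f"{name}: K = {diffusivity(rho, k, c):.2e} m^2/s")
```

To two significant figures the computed values agree with the final column of the table.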


Example 

Let’s try to find the steady state temperature distribution in a solid, uniform sphere
of unit radius, when the surface temperature is held at $f(\theta) = T_0\sin^4\theta$ in spherical
polar coordinates, $(r, \theta, \phi)$. This temperature distribution will satisfy Laplace’s
equation, (2.13). Since the equation and boundary conditions do not depend upon
the azimuthal angle, $\phi$, neither does the solution, and hence Laplace’s equation
takes the form

$$\frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial T}{\partial r}\right) + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial T}{\partial\theta}\right) = 0.$$

Let’s look for a separable solution, $T(r, \theta) = R(r)\Theta(\theta)$. This gives

$$\Theta\frac{d}{dr}\left(r^2\frac{dR}{dr}\right) + \frac{R}{\sin\theta}\frac{d}{d\theta}\left(\sin\theta\frac{d\Theta}{d\theta}\right) = 0,$$

and hence

$$\frac{1}{R}\frac{d}{dr}\left(r^2\frac{dR}{dr}\right) = -\frac{1}{\Theta\sin\theta}\frac{d}{d\theta}\left(\sin\theta\frac{d\Theta}{d\theta}\right).$$

Since the left hand side is a function of $r$ only and the right hand side is a function
of $\theta$ only, this equality can only be valid if both sides are equal to some constant,
with

$$\frac{1}{R}\frac{d}{dr}\left(r^2\frac{dR}{dr}\right) = -\frac{1}{\Theta\sin\theta}\frac{d}{d\theta}\left(\sin\theta\frac{d\Theta}{d\theta}\right) = \text{constant} = n(n+1).$$

This choice of constant may seem rather ad hoc at this point, but all will become
clear later.

We now have an ordinary differential equation for $\Theta$, namely

$$\frac{1}{\sin\theta}\frac{d}{d\theta}\left(\sin\theta\frac{d\Theta}{d\theta}\right) + n(n+1)\Theta = 0.$$

Changing variables to

$$\mu = \cos\theta, \quad y(\mu) = \Theta(\theta),$$

and using

$$\sin^2\theta = 1 - \mu^2, \quad \frac{d}{d\mu} = \frac{d\theta}{d\mu}\frac{d}{d\theta} = -\frac{1}{\sin\theta}\frac{d}{d\theta},$$

leads to

$$\frac{d}{d\mu}\left\{(1 - \mu^2)\frac{dy}{d\mu}\right\} + n(n+1)y = 0,$$

or, equivalently

$$(1 - \mu^2)y'' - 2\mu y' + n(n+1)y = 0,$$

which is Legendre’s equation. Since a physically meaningful solution must in this
case be finite at $\mu = \pm 1$ (the north and south poles of the sphere), the solution,
to within an arbitrary multiplicative constant, is the Legendre polynomial, $y(\mu) = P_n(\mu)$,
and hence $\Theta_n(\theta) = P_n(\cos\theta)$. We have introduced the subscript $n$ for $\Theta$ so
that we can specify that this is the solution corresponding to a particular choice of
$n$. Note that we have just solved our first boundary value problem. Specifically, we
have found the solution of Legendre’s equation that is bounded at $\mu = \pm 1$.

We must now consider the solution of the equation for $R(r)$. This is

$$\frac{d}{dr}\left(r^2\frac{dR}{dr}\right) = n(n+1)R.$$

By inspection, by the method of Frobenius or by noting that the equation is
invariant under the transformation group $(r, R) \mapsto (\lambda r, \lambda R)$ (see Chapter 10), we can
find the solution of this equation as

$$R_n(r) = A_nr^n + B_nr^{-1-n},$$

where again the subscript $n$ denotes the dependence of the solution on $n$ and $A_n$
and $B_n$ are arbitrary constants. At the centre of the sphere, the temperature must
be finite, so we set $B_n = 0$. Our solution of Laplace’s equation therefore takes the
form

$$T = A_nr^nP_n(\cos\theta).$$

As this is a solution for arbitrary $n$, the general solution for the temperature will
be a linear combination of these solutions,

$$T = \sum_{n=0}^{\infty} A_nr^nP_n(\cos\theta).$$

The remaining task is to evaluate the coefficients $A_n$. This can be done using the
specified temperature on the surface of the sphere, $T = T_0\sin^4\theta$ at $r = 1$. We
substitute $r = 1$ into the general expression for the temperature to get

$$f(\theta) = T_0\sin^4\theta = \sum_{n=0}^{\infty} A_nP_n(\cos\theta).$$

The $A_n$ will therefore be the coefficients in the Fourier-Legendre expansion of the
function $f(\theta) = T_0\sin^4\theta$.

It is best to work in terms of the variable $\mu = \cos\theta$. We then have

$$T_0(1 - \mu^2)^2 = \sum_{n=0}^{\infty} A_nP_n(\mu).$$

From (2.9),

$$A_n = \frac{2n+1}{2}\int_{-1}^{1} T_0(1 - \mu^2)^2P_n(\mu)\,d\mu.$$

Since the function that we want to expand is a finite polynomial in $\mu$, we expect
to obtain a finite Fourier-Legendre series. A straightforward calculation of the
integral gives us

$$A_0 = \frac{1}{2}\cdot\frac{16}{15}T_0, \quad A_1 = 0, \quad A_2 = -\frac{5}{2}\cdot\frac{32}{105}T_0, \quad A_3 = 0, \quad A_4 = \frac{9}{2}\cdot\frac{16}{315}T_0,$$

$$A_m = 0 \quad \text{for } m = 5, 6, \ldots.$$

The solution is therefore

$$T = T_0\left\{\frac{8}{15}P_0(\cos\theta) - \frac{16}{21}r^2P_2(\cos\theta) + \frac{8}{35}r^4P_4(\cos\theta)\right\}. \tag{2.14}$$

This solution when $T_0 = 1$ is shown in Figure 2.3, which is a polar plot in a plane
of constant $\phi$. Note that the temperature at the centre of the sphere is $8T_0/15$. We
produced Figure 2.3 using the MATLAB script

    ezmesh('r*cos(t)', 'r*sin(t)', ['8/15-8*r^2*(3*cos(t)^2-1)/21+', ...
        '8*r^4*(35*cos(t)^4/8-15*cos(t)^2/4+3/8)/35'], [0 1 0 pi])

The function ezmesh gives an easy way of plotting parametric surfaces like this.
The first three arguments give the $x$, $y$ and $z$ coordinates as parametric functions
of $r$ and $t = \theta$, whilst the fourth specifies the ranges of $r$ and $t$.

Fig. 2.3. The steady state temperature in a uniform sphere with surface temperature
$\sin^4\theta$, given by (2.14).
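
Before plotting, it is worth checking that the series solution really does satisfy the boundary condition. The sketch below is our own and is not the script used for Figure 2.3: it evaluates (2.14) with the coefficients $8/15$, $-16/21$ and $8/35$ and compares the result at $r = 1$ with $T_0\sin^4\theta$. The function names are hypothetical.

```python
import math


def legendre(n, x):
    """P_n(x) from the recurrence (2.5)."""
    p_prev, p = 1.0, x
    if n == 0:
        return p_prev
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    return p


def temperature(r, theta, t0=1.0):
    """Evaluate the series solution (2.14) at the point (r, theta)."""
    mu = math.cos(theta)
    return t0 * (8 / 15 * legendre(0, mu)
                 - 16 / 21 * r**2 * legendre(2, mu)
                 + 8 / 35 * r**4 * legendre(4, mu))


for theta in (0.0, 0.4, math.pi / 2):
    print(theta, temperature(1.0, theta), math.sin(theta) ** 4)
```

The last two columns should coincide on the surface, and `temperature(0.0, theta)` returns $8/15$ for any $\theta$, the value at the centre of the sphere noted above.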


2.6.2 Fluid Flow 

Consider a fixed volume $V$ bounded by a surface $S$ within an incompressible
fluid. Although fluid flows into and out of $V$, the mass of fluid within $V$ remains
constant, since the fluid is incompressible, so any flux out of $V$ at one place is
balanced by a flux into $V$ at another place. Mathematically, we can express this as

$$\int_S \mathbf{u}\cdot\mathbf{n}\,dS = 0,$$

where $\mathbf{u}$ is the velocity of the fluid, $\mathbf{n}$ is the outward unit normal to $S$, and hence
$\mathbf{u}\cdot\mathbf{n}$ is the normal component of the fluid velocity out through $S$. If the fluid velocity
field, $\mathbf{u}$, is smooth, we can use the divergence theorem to rewrite this statement of
conservation of mass as

$$\int_V \nabla\cdot\mathbf{u}\,dV = 0.$$

As this applies to an arbitrary volume $V$ within the fluid, we must have $\nabla\cdot\mathbf{u} = 0$
throughout the flow. If we also suppose that the fluid is inviscid (there is no
friction as one fluid element flows past another), it is possible to make the further
assumption that the flow is irrotational ($\nabla\times\mathbf{u} = 0$). Physically, this means that
there is no spin in any fluid element as it flows around. Inviscid, irrotational,
incompressible flows are therefore governed by the two equations $\nabla\cdot\mathbf{u} = 0$ and
$\nabla\times\mathbf{u} = 0$.† For simply connected flow domains, $\nabla\times\mathbf{u} = 0$ if and only if $\mathbf{u} = \nabla\phi$ for
some scalar potential function $\phi$, known as the velocity potential. Substituting
into $\nabla\cdot\mathbf{u} = 0$ gives $\nabla^2\phi = 0$. In other words, the velocity potential satisfies
Laplace’s equation. As for boundary conditions, there can be no flux of fluid into
a solid body so that $\mathbf{u}\cdot\mathbf{n} = \mathbf{n}\cdot\nabla\phi = \partial\phi/\partial n = 0$ where the fluid is in contact with
a solid body.

As an example of such a flow, let’s consider what happens when a sphere of radius
$r = a$ is placed in a uniform stream, assuming that the flow is steady, inviscid and
irrotational. The flow at infinity must be uniform, with $\mathbf{u} = U\mathbf{i}$ where $\mathbf{i}$ is the
unit vector in the $x$-direction. First of all, it is clear that the problem will be best
treated in spherical polar coordinates. We know from the previous example that
the bounded, axisymmetric solution of Laplace’s equation is

$$\phi = \sum_{n=0}^{\infty}\left(A_nr^n + B_nr^{-1-n}\right)P_n(\cos\theta). \tag{2.15}$$

The flow at infinity has potential $\phi = Ux = Ur\cos\theta$. Since $P_1(\cos\theta) = \cos\theta$, we
see that we must take $A_1 = U$ and $A_n = 0$ for $n > 1$. To fix the constants $B_n$,
notice that there can be no flow through the surface of the sphere. This gives us
the boundary condition on the radial velocity as

$$u_r = \frac{\partial\phi}{\partial r} = 0 \quad \text{at } r = a.$$

On substituting (2.15) into this boundary condition, we find that $B_1 = \frac{1}{2}Ua^3$ and
$B_n = 0$ for $n > 1$. The solution is therefore

$$\phi = U\left(r + \frac{a^3}{2r^2}\right)\cos\theta. \tag{2.16}$$
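
A quick numerical check of (2.16) is reassuring. The sketch below is our own: it evaluates the radial velocity $\partial\phi/\partial r = U(1 - a^3/r^3)\cos\theta$, obtained by differentiating (2.16) by hand, and estimates the axisymmetric Laplacian of $\phi$ by central differences. The step size and sample point are arbitrary choices, and the function names are hypothetical.

```python
import math


def phi(r, theta, U=1.0, a=1.0):
    """Velocity potential (2.16) for uniform flow past a sphere of radius a."""
    return U * (r + a**3 / (2 * r**2)) * math.cos(theta)


def radial_velocity(r, theta, U=1.0, a=1.0):
    """u_r = d(phi)/dr, differentiated by hand from (2.16)."""
    return U * (1 - a**3 / r**3) * math.cos(theta)


def laplacian(r, theta, h=1e-3):
    """Central-difference estimate of the axisymmetric Laplacian of phi."""
    d_r = ((r + h / 2) ** 2 * (phi(r + h, theta) - phi(r, theta)) / h
           - (r - h / 2) ** 2 * (phi(r, theta) - phi(r - h, theta)) / h)
    d_t = (math.sin(theta + h / 2) * (phi(r, theta + h) - phi(r, theta)) / h
           - math.sin(theta - h / 2) * (phi(r, theta) - phi(r, theta - h)) / h)
    return d_r / (h * r**2) + d_t / (h * r**2 * math.sin(theta))


print(radial_velocity(1.0, 0.3))  # no flow through the surface r = a
print(laplacian(1.7, 0.9))        # small: phi satisfies Laplace's equation
```

The radial velocity vanishes on $r = a$ and tends to $U\cos\theta$ far from the sphere, as the boundary conditions require.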


The streamlines (continuous curves that are tangent to the velocity vector) are
shown in Figure 2.4 when $a = U = 1$. In order to obtain this figure, we note that
(see Section 7.2.2) the streamlines are given by

$$\psi = U\left(r - \frac{a^3}{r^2}\right)\sin\theta = Uy\left(1 - \frac{a^3}{(x^2 + y^2)^{3/2}}\right) = \text{constant}.$$

We can then use the MATLAB script

    x = linspace(-2,2,400); y = linspace(0,2,200);
    [X Y] = meshgrid(x,y);
    Z = Y.*(1-1./(X.^2+Y.^2).^(3/2));
    v = linspace(0,2,15); contour(X,Y,Z,v)

† We will consider an example of viscous flow in Section 6.4.

The command meshgrid creates a grid suitable for use with the plotting command
contour out of the two vectors, x and y. The vector v specifies the values of $\psi$ for
which a contour is to be plotted.

Fig. 2.4. The streamlines for inviscid, irrotational flow about a unit sphere.


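In order to complete this example, we must consider the pressure in our ideal
fluid. The force exerted on a surface $S$ by the fluid outside $S$ is purely
a pressure, $p$, which acts normally to $S$. In other words, the force on a surface
element with area $dS$ and outward unit normal $\mathbf{n}$ is $-p\mathbf{n}\,dS$. We would now like to
apply Newton’s second law to the motion of the fluid within $V$, the volume enclosed
by $S$. In order to do this, we need an expression for the acceleration of the fluid.
Let’s consider the change in the velocity of a fluid particle between times $t$ and
$t + \delta t$, which, for small $\delta t$, we can Taylor expand to give

$$\begin{aligned}
\mathbf{u}(\mathbf{x}(t + \delta t), t + \delta t) - \mathbf{u}(\mathbf{x}(t), t) &= \delta t\left\{\frac{\partial\mathbf{u}}{\partial t}(\mathbf{x}, t) + \frac{d\mathbf{x}}{dt}\cdot\nabla\mathbf{u}(\mathbf{x}, t)\right\} + \cdots \\
&= \delta t\left\{\frac{\partial\mathbf{u}}{\partial t}(\mathbf{x}, t) + (\mathbf{u}(\mathbf{x}, t)\cdot\nabla)\,\mathbf{u}(\mathbf{x}, t)\right\} + \cdots,
\end{aligned}$$

since $\mathbf{u} = d\mathbf{x}/dt$. This means that the fluid acceleration is

$$\frac{D\mathbf{u}}{Dt} = \frac{\partial\mathbf{u}}{\partial t} + (\mathbf{u}\cdot\nabla)\,\mathbf{u},$$

where $D/Dt$ is the usual notation for the material derivative, or time derivative
following the fluid particle.

We can now use Newton’s second law on the fluid within $S$ to obtain

$$\int_S -p\mathbf{n}\,dS = \int_V \rho\frac{D\mathbf{u}}{Dt}\,dV.$$

After noting that

$$\int_S p\mathbf{n}\,dS = \mathbf{i}\int_S (p\mathbf{i})\cdot\mathbf{n}\,dS + \mathbf{j}\int_S (p\mathbf{j})\cdot\mathbf{n}\,dS + \mathbf{k}\int_S (p\mathbf{k})\cdot\mathbf{n}\,dS,$$

where $\mathbf{i}$, $\mathbf{j}$, $\mathbf{k}$ are unit vectors in the coordinate directions, the divergence theorem,
applied to each of these integrals, shows that

$$\int_S p\mathbf{n}\,dS = \mathbf{i}\int_V \frac{\partial p}{\partial x}\,dV + \mathbf{j}\int_V \frac{\partial p}{\partial y}\,dV + \mathbf{k}\int_V \frac{\partial p}{\partial z}\,dV = \int_V \nabla p\,dV,$$

and hence that

$$\int_V \left\{\rho\frac{D\mathbf{u}}{Dt} + \nabla p\right\}dV = 0.$$

Since $V$ is arbitrary,

$$\frac{D\mathbf{u}}{Dt} = -\frac{1}{\rho}\nabla p,$$

and for steady flows

$$(\mathbf{u}\cdot\nabla)\,\mathbf{u} = -\frac{1}{\rho}\nabla p. \tag{2.17}$$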


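We earlier used the irrotational nature of the flow to write the velocity field as the
gradient of a scalar potential, $\mathbf{u} = \nabla\phi$. Since $(\mathbf{u}\cdot\nabla)\mathbf{u} = \nabla\left(\frac{1}{2}\mathbf{u}\cdot\mathbf{u}\right) - \mathbf{u}\times(\nabla\times\mathbf{u})$,
for irrotational flow $(\mathbf{u}\cdot\nabla)\mathbf{u} = \nabla\left(\frac{1}{2}\mathbf{u}\cdot\mathbf{u}\right)$ and we can write (2.17) as

$$\nabla\left(\frac{1}{2}|\nabla\phi|^2\right) = -\frac{1}{\rho}\nabla p,$$

and hence

$$\nabla\left(\frac{p}{\rho} + \frac{1}{2}|\nabla\phi|^2\right) = 0,$$

which we can integrate to give

$$\frac{p}{\rho} + \frac{1}{2}|\nabla\phi|^2 = C, \tag{2.18}$$

which is known as Bernoulli’s equation. Its implication for inviscid, irrotational
flow is that the pressure can be calculated once we know the velocity potential, $\phi$.
In our example, $\phi$ is given by (2.16), so that

$$|\nabla\phi|^2 = \phi_r^2 + \frac{1}{r^2}\phi_\theta^2 = U^2\left\{\left(1 - \frac{a^3}{r^3}\right)^2\cos^2\theta + \frac{1}{r^2}\left(r + \frac{a^3}{2r^2}\right)^2\sin^2\theta\right\},$$

and Bernoulli’s equation gives

$$p = p_\infty + \frac{1}{2}\rho U^2\left[1 - \left(1 - \frac{a^3}{r^3}\right)^2\cos^2\theta - \frac{1}{r^2}\left(r + \frac{a^3}{2r^2}\right)^2\sin^2\theta\right],$$

where we have written $p_\infty$ for the pressure at infinity. This expression simplifies on
the surface of the sphere to give

$$p|_{r=a} = p_\infty + \frac{1}{2}\rho U^2\left(1 - \frac{9}{4}\sin^2\theta\right).$$

This shows that the pressure is highest at $\theta = 0$ and $\pi$, the stream-facing poles of
the sphere, with $p = p_\infty + \frac{1}{2}\rho U^2$, and drops below $p_\infty$ over a portion of the rest of
the boundary, with the lowest pressure, $p = p_\infty - \frac{5}{8}\rho U^2$, on the equator, $\theta = \pi/2$.
The implications of this and the modelling assumptions made are discussed more
fully by Acheson (1990).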
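
The surface pressure is conveniently summarised by the dimensionless quantity $(p - p_\infty)/\frac{1}{2}\rho U^2 = 1 - \frac{9}{4}\sin^2\theta$, obtained by setting $r = a$ in Bernoulli’s equation. The sketch below is our own illustration: it evaluates this quantity at the poles and the equator, and finds the polar angle $\theta = \sin^{-1}(2/3)$ at which the surface pressure equals $p_\infty$. The function names are hypothetical.

```python
import math


def pressure_coefficient(theta):
    """(p - p_inf) / (rho U^2 / 2) on the sphere surface r = a."""
    return 1.0 - 9.0 / 4.0 * math.sin(theta) ** 2


def zero_crossing():
    """Polar angle at which the surface pressure equals p_inf."""
    return math.asin(2.0 / 3.0)


for theta in (0.0, zero_crossing(), math.pi / 2):
    print(round(math.degrees(theta), 2), pressure_coefficient(theta))
```

The value $-5/4$ on the equator corresponds to $p = p_\infty - \frac{5}{8}\rho U^2$, the lowest pressure on the sphere.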

2.7 The Associated Legendre Equation 

We can also look for a separable solution of Laplace’s equation in spherical polar
coordinates $(r, \theta, \phi)$, that is not necessarily axisymmetric. If we seek a solution of
the form $y(\theta)\Phi(\phi)R(r)$, we find that $\Phi = e^{im\phi}$ with $m$ an integer. The equation
for $y$ is, after using the change of variable $x = \cos\theta$,

$$(1 - x^2)\frac{d^2y}{dx^2} - 2x\frac{dy}{dx} + \left\{n(n+1) - \frac{m^2}{1 - x^2}\right\}y = 0. \tag{2.19}$$

This equation is known as the associated Legendre equation. It trivially reduces
to Legendre’s equation when $m = 0$, corresponding to separable axisymmetric
solutions of Laplace’s equation. However, the connection is more profound than
this.

If we define

$$Y = (1 - x^2)^{-m/2}\,y,$$

$Y$ is the solution of

$$(1 - x^2)Y'' - 2(m+1)xY' + (n - m)(n + m + 1)Y = 0. \tag{2.20}$$

If we write Legendre’s equation in the form

$$(1 - x^2)Z'' - 2xZ' + n(n+1)Z = 0,$$

which has solution $Z = AP_n(x) + BQ_n(x)$, and differentiate $m$ times, we get

$$(1 - x^2)[Z^{(m)}]'' - 2(m+1)x[Z^{(m)}]' + (n - m)(n + m + 1)Z^{(m)} = 0, \tag{2.21}$$

where we have written $Z^{(m)} = d^mZ/dx^m$. A comparison of (2.20) and (2.21) shows
that

$$Y = \frac{d^m}{dx^m}\left[AP_n(x) + BQ_n(x)\right],$$

and therefore that

$$y = (1 - x^2)^{m/2}\left\{A\frac{d^m}{dx^m}P_n(x) + B\frac{d^m}{dx^m}Q_n(x)\right\}.$$

We have now shown that the solutions of the differential equation (2.19) are

$$y = AP_n^m(x) + BQ_n^m(x),$$

where

$$P_n^m(x) = (1 - x^2)^{m/2}\frac{d^m}{dx^m}P_n(x), \quad Q_n^m(x) = (1 - x^2)^{m/2}\frac{d^m}{dx^m}Q_n(x).$$

The functions $P_n^m(x)$ and $Q_n^m(x)$ are called the associated Legendre functions.
Clearly, from what we know about the Legendre polynomials, $P_n^m(x) = 0$ if $m > n$
and $Q_n^m(x)$ is singular at $x = \pm 1$. It is straightforward to show from their definitions
that

$$P_1^1(x) = (1 - x^2)^{1/2} = \sin\theta, \quad P_2^1(x) = 3x(1 - x^2)^{1/2} = 3\sin\theta\cos\theta = \frac{3}{2}\sin 2\theta,$$

$$P_2^2(x) = 3(1 - x^2) = 3\sin^2\theta = \frac{3}{2}(1 - \cos 2\theta),$$

$$P_3^1(x) = \frac{3}{2}(5x^2 - 1)(1 - x^2)^{1/2} = \frac{3}{8}(\sin\theta + 5\sin 3\theta),$$

where $x = \cos\theta$.

There are various recurrence formulae and orthogonality relations between the
associated Legendre functions, which can be derived in much the same way as those
for the ordinary Legendre functions.
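
The definition $P_n^m(x) = (1 - x^2)^{m/2}\,d^mP_n/dx^m$ translates directly into a short computation. The sketch below is our own: it builds the coefficients of $P_n$ from the recurrence (2.5), differentiates them $m$ times and multiplies by $(1 - x^2)^{m/2}$, following the sign convention used in this section (no additional factor of $(-1)^m$). The function names are hypothetical.

```python
import math
from fractions import Fraction


def legendre_coeffs(n):
    """Coefficients (lowest power first) of P_n via the recurrence (2.5)."""
    prev, cur = [Fraction(1)], [Fraction(0), Fraction(1)]
    if n == 0:
        return prev
    for k in range(1, n):
        x_cur = [Fraction(0)] + cur
        padded = prev + [Fraction(0)] * 2
        prev, cur = cur, [((2 * k + 1) * a - k * b) / (k + 1)
                          for a, b in zip(x_cur, padded)]
    return cur


def associated_legendre(n, m, x):
    """P_n^m(x) = (1 - x^2)^(m/2) d^m P_n / dx^m, for -1 <= x <= 1."""
    coeffs = legendre_coeffs(n)
    for _ in range(m):                       # differentiate m times
        coeffs = [j * c for j, c in enumerate(coeffs)][1:]
    value = sum(float(c) * x**j for j, c in enumerate(coeffs))
    return (1.0 - x * x) ** (m / 2) * value


theta = 0.6
print(associated_legendre(3, 1, math.cos(theta)),
      3 / 8 * (math.sin(theta) + 5 * math.sin(3 * theta)))
```

The two printed values should agree, and the function returns zero when $m > n$ because the coefficient list is then differentiated away completely.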


Example: Spherical harmonics 

Let’s try to find a representation of the solutions of Laplace’s equation in three
dimensions,

$$\frac{\partial^2u}{\partial x^2} + \frac{\partial^2u}{\partial y^2} + \frac{\partial^2u}{\partial z^2} = 0,$$

that are homogeneous in $x$, $y$ and $z$. By homogeneous we mean of the form $x^iy^jz^k$.
It is simplest to work in spherical polar coordinates, in terms of which Laplace’s
equation takes the form

$$\frac{\partial}{\partial r}\left(r^2\frac{\partial u}{\partial r}\right) + \frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial u}{\partial\theta}\right) + \frac{1}{\sin^2\theta}\frac{\partial^2u}{\partial\phi^2} = 0.$$

A homogeneous solution of order $n = i + j + k$ in $(x, y, z)$ will look, in the new
coordinates, like $u = r^nS_n(\theta, \phi)$. Substituting this expression for $u$ in Laplace’s
equation, we find that

$$\frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial S_n}{\partial\theta}\right) + \frac{1}{\sin^2\theta}\frac{\partial^2S_n}{\partial\phi^2} + n(n+1)S_n = 0.$$

Separable solutions take the form

$$S_n = (A_m\cos m\phi + B_m\sin m\phi)F(\theta),$$

with $m$ an integer. The function $F(\theta)$ satisfies

$$\frac{1}{\sin\theta}\frac{d}{d\theta}\left(\sin\theta\frac{dF}{d\theta}\right) + \left\{n(n+1) - \frac{m^2}{\sin^2\theta}\right\}F = 0.$$

If we now make the usual transformation, $\mu = \cos\theta$, $F(\theta) = y(\mu)$, we get

$$(1 - \mu^2)y'' - 2\mu y' + \left\{n(n+1) - \frac{m^2}{1 - \mu^2}\right\}y = 0,$$

which is the associated Legendre equation. Our solutions can therefore be written
in the form

$$u = r^n(A_m\cos m\phi + B_m\sin m\phi)\left\{C_{n,m}P_n^m(\cos\theta) + D_{n,m}Q_n^m(\cos\theta)\right\}.$$

The quantities $r^n\cos m\phi\,P_n^m(\cos\theta)$, which appear as typical terms in the solution,
are called spherical harmonics and appear widely in the solution of linear
boundary value problems in spherical geometry. We will see how they arise in a quantum
mechanical description of the hydrogen atom in Section 4.3.6.


Exercises 


2.1 


2.2 

2.3 


2.4 


2.5 


2.6 

2.7 


Solve Legendre’s equation of order two, (1 — x 2 )y" — 2 xy' + 6y = 0, by the 
method of Frobenius. What is the simplest solution of this equation? By 
using this simple solution and the reduction of order method find a closed 
form expression for Qi(x). 

Use the generating function to evaluate (a) P' n { 1), (b) P„(0). 

Prove that 

(a) P' n+ iO) - P' n -\{ x ) = (2 n + l)P„(x), 

(b) (1 - x 2 )P' n {x) = nP n -i(x) - nxP n {x). 

Find the first four nonzero terms in the Fourier-Legenclre expansion of the 
function 

... . f 0 for -1 < x < 0, 

/( * ) = \l for 0 < a; < 1. 

What value will this series have at x = 0? 

Establish the results 

f 1 2 n 

(a) J xP n (x)P n -i(x)dx = ^ for n = 1,2, . . . , 

(b) J P n (x)P' n+l {x)dx = 2 for n = 0, 1, . . . , 

/ 1 2 n 

xP' n {x)P n {x)dx = ^ + 1 for n = 0, 1, 

Determine the Wronskian of P n and Q n for n = 0, 1, 2, . . . . 

Solve the axisymmetric boundary value problem for Laplace’s equation, 

V 2 T = 0 for 0 < r < a, 0 < 9 < 7r, 

T(a,9) = 2 cos 5 9. 


2.8 


Show that 


(a) P 2 (x) 

(b) Pl(x) 


15x(l — x 2 ), 

^(7x 3 -3x)(l-x 2 ) 1/2 . 
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2.9 * Prove that \P' n {x)\ < n 2 and \P”(x)\ < n 4 for — 1 < x < 1. 

2.10 Derive Equation (2.10). 

2.11 * Find the solution of the Dirichlet problem, V 2< 1> = 0 in r > 2 subject to 
$ — > 0 as r — > oo and 4>(2, 9, <j>) = sin 2 9 cos 2 <f>. 

2.12 * The self-adjoint form of the associated Legendre equation is 

{(1 - x2 ) p n{x)} + { n(n + 1) - P?(x) = 0. 

Using this directly, prove the orthogonality property 
J P[ a (x)P// l (x)dx = 0 for l ^ n. 

Evaluate 

J i \FZ{x )} 2 dx. 

2.13 (a) Suppose P n (xo) = 0 for some Xo G (—1, 1). Show that Xq is a simple 

zero. 

(b) Show that P n with n ^ 1 has n distinct zeros in (—1, 1). 

2.14 Project A simplified model for the left ventricle of a human heart is provided
by a sphere of time-dependent radius $R = R(t)$ with a circular aortic
opening of constant area $A$, as shown in Figure 2.5. During contraction
we suppose that the opening remains fixed whilst the centre of the sphere
moves directly toward the centre of the opening and the radius $R(t)$ decreases
accordingly. As a result, some of the blood filling the ventricle
cavity is ejected through the opening with mean speed $U = U(t)$ into the
attached cylindrical aorta. This occurs sufficiently rapidly that we can assume
that the flow is inviscid, irrotational, incompressible, and symmetric
with respect to the aortal axis.

(a) State a partial differential equation appropriate to the fluid flow for 
this situation. 

(b) During contraction, show that a point on the surface of the ventricle
has velocity

\[ \dot{R}\,\mathbf{n} - \frac{d}{dt}(Ra)\,\mathbf{i}, \]

where $\mathbf{n}$ is the outward unit normal at time $t$, $\mathbf{i}$ is the unit vector
in the aortic flow direction and $a = \cos\alpha(t)$. Show that having
$(R\sin\alpha)^2 = R^2(1 - a^2)$ constant gives

\[ \frac{\partial\phi}{\partial n} = f(s) \equiv \begin{cases} Us & \text{for } a < s < 1, \\ \dot{R}(1 - s/a) & \text{for } -1 < s < a, \end{cases} \]

where $s = \cos\theta$ with $\theta$ the usual spherical polar coordinate.

Fig. 2.5. A simple model for the left ventricle of the human heart.

(c) Show that, for a solution to exist, $\int_{-1}^{1} f(s)\,ds = 0$. Deduce that
$\dot{R} = [a(a - 1)/(a + 1)]U$, which relates the geometry to the mean
aortal speed.

(d) Let $V = V(t)$ denote both the interior of the sphere at time $t$ and
its volume. The total momentum in the direction of $\mathbf{i}$ is the blood
density times the integral

\[ I = \int_V \nabla\phi\cdot\mathbf{i}\,dV = \int_S (s\phi)\big|_{r=R}\,dS, \]

where $S$ is the surface of $V$. Hence show that

\[ I = 2\pi R^2\sum_{n=0}^{\infty} c_n\int_{-1}^{1} sP_n(s)\,ds = \frac{4}{3}\pi R^2c_1 = \frac{3}{2}V\int_{-1}^{1} sf(s)\,ds. \]

(e) Use the answer to part (c) to show that $4I = (1 - a)(a^2 + 4 + 3a)VU$.

(f) Explain how this model could be made more realistic in terms of the 
fluid mechanics and the physiology. You may like to refer to Pedley 
(1980) for some ideas on this. 



CHAPTER THREE 


Bessel Functions 


In this chapter, we will discuss a class of functions known as Bessel functions. 
These are named after the German mathematician and astronomer Friedrich Bessel, 
who first used them to analyze planetary orbits, as we shall discuss later. Bessel 
functions occur in many other physical problems, usually in a cylindrical geometry, 
and we will discuss some examples of these at the end of this chapter. 

Bessel’s equation can be written in the form 


\[ x^2\frac{d^2y}{dx^2} + x\frac{dy}{dx} + (x^2 - \nu^2)y = 0, \tag{3.1} \]

with $\nu$ real and positive. Note that (3.1) has a regular singular point at $x = 0$.
Using the notation of Chapter 1,

\[ \frac{xQ}{P} = \frac{x^2}{x^2} = 1, \qquad \frac{x^2R}{P} = \frac{x^2(x^2 - \nu^2)}{x^2} = x^2 - \nu^2, \]

both of which are polynomials and have Taylor expansions with infinite radii of 
convergence. Any series solution will therefore also have an infinite radius of con- 
vergence. 


3.1 The Gamma Function and the Pochhammer Symbol

Before we use the method of Frobenius to construct the solutions of Bessel’s equa- 
tion, it will be useful for us to make a couple of definitions. The gamma function 
is defined by 


\[ \Gamma(x) = \int_0^\infty e^{-q}q^{x-1}\,dq, \quad \text{for } x > 0. \tag{3.2} \]


Note that the integration is over the dummy variable q and x is treated as constant 
during the integration. We will start by considering the function evaluated at x = 1. 
By definition,

\[ \Gamma(1) = \int_0^\infty e^{-q}\,dq = 1. \]

We also note that

\[ \Gamma(x + 1) = \int_0^\infty e^{-q}q^x\,dq, \]

which can be integrated by parts to give

\[ \Gamma(x + 1) = \left[-q^xe^{-q}\right]_0^\infty + \int_0^\infty e^{-q}xq^{x-1}\,dq = x\int_0^\infty e^{-q}q^{x-1}\,dq = x\Gamma(x). \]

We therefore have the recursion formula

\[ \Gamma(x + 1) = x\Gamma(x). \tag{3.3} \]

Suppose that $x = n$ is a positive integer. Then

\[ \Gamma(n + 1) = n\Gamma(n) = n(n - 1)\Gamma(n - 1) = \cdots = n(n - 1)\cdots 2\cdot 1 = n!. \tag{3.4} \]

We therefore have the useful result, $\Gamma(n + 1) = n!$ for $n$ a positive integer.

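We will often need to know $\Gamma(1/2)$. Firstly, consider the definition,

\[ \Gamma\left(\tfrac{1}{2}\right) = \int_0^\infty e^{-q}q^{-1/2}\,dq. \]

If we introduce the new variable $Q = \sqrt{q}$, so that $dQ = \frac{1}{2}q^{-1/2}\,dq$, this integral
becomes

\[ \Gamma\left(\tfrac{1}{2}\right) = 2\int_0^\infty e^{-Q^2}\,dQ. \]

We can also write this integral in terms of another new variable, $\tilde{Q} = Q$, to obtain

\[ \Gamma\left(\tfrac{1}{2}\right)^2 = \left(2\int_0^\infty e^{-Q^2}\,dQ\right)\left(2\int_0^\infty e^{-\tilde{Q}^2}\,d\tilde{Q}\right). \]

Since the limits are independent, we can combine the integrals as

\[ \Gamma\left(\tfrac{1}{2}\right)^2 = 4\int_0^\infty\int_0^\infty e^{-(Q^2 + \tilde{Q}^2)}\,dQ\,d\tilde{Q}. \]

If we now change to standard polar coordinates we have $dQ\,d\tilde{Q} = r\,dr\,d\theta$, where
$Q = r\cos\theta$ and $\tilde{Q} = r\sin\theta$, and hence

\[ \Gamma\left(\tfrac{1}{2}\right)^2 = 4\int_{\theta=0}^{\pi/2}\int_{r=0}^{\infty} e^{-r^2}r\,dr\,d\theta. \]

The limits of integration give us the positive quadrant of the $(Q, \tilde{Q})$-plane, as required.
Performing the integration over $\theta$ we have

\[ \Gamma\left(\tfrac{1}{2}\right)^2 = 2\pi\int_0^\infty re^{-r^2}\,dr, \]

and integrating with respect to $r$ gives

\[ \Gamma\left(\tfrac{1}{2}\right)^2 = \pi. \]

Finally, we have

\[ \Gamma\left(\tfrac{1}{2}\right) = 2\int_0^\infty e^{-Q^2}\,dQ = \sqrt{\pi}. \tag{3.5} \]

We can use $\Gamma(x) = \Gamma(x + 1)/x$ to define $\Gamma(x)$ for negative values of $x$. For
example,

\[ \Gamma\left(-\tfrac{1}{2}\right) = \frac{\Gamma\left(\tfrac{1}{2}\right)}{-\tfrac{1}{2}} = -2\Gamma\left(\tfrac{1}{2}\right) = -2\sqrt{\pi}. \]

We also find that $\Gamma(x)$ is singular at $x = 0$. From the definition, (3.2), the integrand
diverges like $1/q$ as $q \to 0$, which is not integrable. Alternatively, $\Gamma(x) = \Gamma(x + 1)/x$
shows that $\Gamma(x) \sim 1/x$ as $x \to 0$. Note that the gamma function is available in
MATLAB as the function gamma.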
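
The special values derived above are easy to confirm numerically. The following is a minimal sketch that uses only the built-in function gamma; the test point x = 2.7 is an arbitrary choice of ours.

```matlab
% Numerical check of the gamma function results derived in this section.
g_one   = gamma(1);      % Gamma(1) = 1
g_five  = gamma(5);      % Gamma(n + 1) = n!, so Gamma(5) = 4! = 24
g_half  = gamma(0.5);    % Gamma(1/2) = sqrt(pi), equation (3.5)
g_mhalf = gamma(-0.5);   % Gamma(-1/2) = -2*sqrt(pi)

% Recursion formula (3.3) at an arbitrary (assumed) test point
x = 2.7;
recursion_error = abs(gamma(x + 1) - x*gamma(x));

fprintf('Gamma(1)    = %.12f\n', g_one);
fprintf('Gamma(5)    = %.12f\n', g_five);
fprintf('Gamma(1/2)  = %.12f  (sqrt(pi) = %.12f)\n', g_half, sqrt(pi));
fprintf('Gamma(-1/2) = %.12f\n', g_mhalf);
fprintf('|Gamma(x+1) - x Gamma(x)| = %.2e\n', recursion_error);
```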


The Pochhammer symbol is a simple way of writing down long products. It
is defined as

\[ (\alpha)_r = \alpha(\alpha + 1)(\alpha + 2)\cdots(\alpha + r - 1), \]

so that, for example, $(\alpha)_1 = \alpha$ and $(\alpha)_2 = \alpha(\alpha + 1)$, and, in general, $(\alpha)_r$ is a
product of $r$ terms. We also choose to define $(\alpha)_0 = 1$. Note that $(1)_n = n!$. A
relationship between the gamma function and the Pochhammer symbol that we will
need later is

\[ \Gamma(x)\,(x)_n = \Gamma(x + n) \tag{3.6} \]

for $x$ real and $n$ a positive integer. To derive this, we start with the definition of
the Pochhammer symbol,

\[ (x)_n = x(x + 1)(x + 2)\cdots(x + n - 1). \]

Now

\[ \Gamma(x)\,(x)_n = \Gamma(x)\{x(x + 1)(x + 2)\cdots(x + n - 1)\} = \{\Gamma(x)x\}\{(x + 1)(x + 2)\cdots(x + n - 1)\}. \]

Using the recursion relation (3.3),

\[ \Gamma(x)\,(x)_n = \Gamma(x + 1)\{(x + 1)(x + 2)\cdots(x + n - 1)\}. \]

We can repeat this to give

\[ \Gamma(x)\,(x)_n = \Gamma(x + n - 1)(x + n - 1) = \Gamma(x + n). \]
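
Relation (3.6) can be checked numerically. In the sketch below, poch is a hypothetical helper of our own (not a built-in MATLAB function) that forms the product directly from the definition; the values of x and n are arbitrary.

```matlab
% Pochhammer symbol (a)_r = a(a+1)...(a+r-1), with (a)_0 = 1.
% "poch" is a hypothetical helper name, not a built-in MATLAB function.
% For r = 0 the range 0:r-1 is empty and prod of an empty array is 1.
poch = @(a, r) prod(a + (0:r-1));

x = 0.3;   % arbitrary real test value (an assumption)
n = 6;     % arbitrary positive integer (an assumption)

lhs = gamma(x)*poch(x, n);   % Gamma(x) (x)_n
rhs = gamma(x + n);          % Gamma(x + n)
rel_error = abs(lhs - rhs)/abs(rhs);

fprintf('(1)_5 = %g (expected 5! = 120)\n', poch(1, 5));
fprintf('relative error in (3.6): %.2e\n', rel_error);
```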


3.2 Series Solutions of Bessel’s Equation 

We can now proceed to consider a Frobenius solution,

\[ y(x) = \sum_{n=0}^{\infty} a_nx^{n+c}. \]

When substituted into (3.1), this yields

\[ x^2\sum_{n=0}^{\infty} a_n(n + c)(n + c - 1)x^{n+c-2} + x\sum_{n=0}^{\infty} a_n(n + c)x^{n+c-1} + (x^2 - \nu^2)\sum_{n=0}^{\infty} a_nx^{n+c} = 0. \]

We can now rearrange this equation and combine corresponding terms to obtain

\[ \sum_{n=0}^{\infty} a_n\{(n + c)^2 - \nu^2\}x^{n+c} + \sum_{n=0}^{\infty} a_nx^{n+c+2} = 0. \]

We can extract two terms from the first summation and then shift the second one
to obtain

\[ a_0(c^2 - \nu^2)x^c + a_1\{(1 + c)^2 - \nu^2\}x^{c+1} + \sum_{n=2}^{\infty}\left[a_n\{(n + c)^2 - \nu^2\} + a_{n-2}\right]x^{n+c} = 0. \tag{3.7} \]

The indicial equation is therefore $c^2 - \nu^2 = 0$, so that $c = \pm\nu$.

We can now distinguish various cases. We start with the case for which the
difference between the two roots of the indicial equation, $2\nu$, is not an integer. Using
Frobenius General Rule II, we can consider both roots of the indicial equation at
once. From the second term of (3.7), which is proportional to $x^{c+1}$, we have

\[ a_1\{(1 \pm \nu)^2 - \nu^2\} = a_1(1 \pm 2\nu) = 0, \]

which implies that $a_1 = 0$ since $2\nu$ is not an integer. The recurrence relation that
we obtain from the general term of (3.7) is

\[ a_n\{(n \pm \nu)^2 - \nu^2\} + a_{n-2} = 0 \quad \text{for } n = 2, 3, \ldots, \]

and hence $a_n = 0$ for $n$ odd. Note that, unlike the series that gives the Legendre
polynomials, which we studied in the previous chapter, this series cannot
terminate to give a polynomial solution. This makes the
Bessel functions rather more difficult to study.

We will now determine an expression for the value of $a_n$ for general values of the
parameter $\nu$. The recurrence relation gives us

\[ a_n = -\frac{a_{n-2}}{n(n \pm 2\nu)}. \]

Let’s start with $n = 2$, which yields

\[ a_2 = -\frac{a_0}{2(2 \pm 2\nu)}, \]

and now with $n = 4$,

\[ a_4 = -\frac{a_2}{4(4 \pm 2\nu)}. \]

Substituting for $a_2$ in terms of $a_0$ gives

\[ a_4 = \frac{a_0}{(4 \pm 2\nu)(2 \pm 2\nu)\cdot 4\cdot 2} = \frac{a_0}{2^2(2 \pm \nu)(1 \pm \nu)\,2^2(2\cdot 1)}. \]

We can continue this process, and we find that

\[ a_{2n} = (-1)^n\frac{a_0}{2^{2n}(1 \pm \nu)_n\,n!}, \tag{3.8} \]

where we have used the Pochhammer symbol to simplify the expression. From this
expression for $a_{2n}$,

\[ y(x) = a_0x^{\pm\nu}\sum_{n=0}^{\infty}(-1)^n\frac{x^{2n}}{2^{2n}(1 \pm \nu)_n\,n!}. \]

With a suitable choice of $a_0$ we can write this as

\[ y(x) = A\,\frac{x^{\pm\nu}}{2^{\pm\nu}\Gamma(1 \pm \nu)}\sum_{n=0}^{\infty}\frac{(-x^2/4)^n}{(1 \pm \nu)_n\,n!}. \]

These are the Bessel functions of order $\pm\nu$. The general solution of Bessel’s equation,
(3.1), is therefore

\[ y(x) = AJ_\nu(x) + BJ_{-\nu}(x), \]

for arbitrary constants $A$ and $B$, with

\[ J_{\pm\nu}(x) = \frac{x^{\pm\nu}}{2^{\pm\nu}\Gamma(1 \pm \nu)}\sum_{n=0}^{\infty}\frac{(-x^2/4)^n}{(1 \pm \nu)_n\,n!}. \tag{3.9} \]

Remember that $2\nu$ is not an integer.
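
As a sanity check on (3.9), we can sum the series directly and compare with MATLAB's besselj. This is only a sketch: the order, the evaluation point and the truncation level are arbitrary choices of ours, and we have used (3.6) to replace $\Gamma(1 \pm \nu)(1 \pm \nu)_n$ by $\Gamma(1 \pm \nu + n)$.

```matlab
% Partial sums of the series (3.9) for J_nu and J_{-nu}, compared with besselj.
nu = 1/3;    % order with 2*nu not an integer (assumed test value)
x  = 2.5;    % evaluation point (assumed test value)
N  = 30;     % number of terms retained (assumed truncation level)
n  = 0:N;

% By (3.6), Gamma(1 +/- nu) (1 +/- nu)_n = Gamma(1 +/- nu + n), so the
% general term is (-1)^n (x/2)^(2n +/- nu) / (n! Gamma(n +/- nu + 1)).
terms_plus  = (-1).^n .* (x/2).^(2*n + nu) ./ (gamma(n + 1) .* gamma(n + nu + 1));
terms_minus = (-1).^n .* (x/2).^(2*n - nu) ./ (gamma(n + 1) .* gamma(n - nu + 1));

err_plus  = abs(sum(terms_plus)  - besselj(nu, x));
err_minus = abs(sum(terms_minus) - besselj(-nu, x));

fprintf('error for J_nu:    %.2e\n', err_plus);
fprintf('error for J_(-nu): %.2e\n', err_minus);
```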

Let’s now consider what happens when $\nu = 0$, in which case the indicial equation
has a repeated root and we need to apply Frobenius General Rule III. By setting
$\nu = 0$ in the expression (3.8) for $a_{2n}$ and exploiting the fact that $(1)_n = n!$, one
solution is

\[ J_0(x) = \sum_{n=0}^{\infty}(-1)^n\frac{1}{(n!)^2}\left(\frac{x^2}{4}\right)^n. \]

Using Frobenius General Rule III, we can show that the other solution is

\[ Y_0(x) = J_0(x)\log x - \sum_{n=0}^{\infty}(-1)^n\frac{\phi(n)}{(n!)^2}\left(\frac{x^2}{4}\right)^n, \]

which is called Weber’s Bessel function of order zero. This expression can be
derived by evaluating $\partial y/\partial c$ at $c = 0$. Note that we have made use of the function
$\phi(n)$ defined in (1.21).

We now consider the case for which $2\nu$ is a nonzero integer, beginning with $2\nu$
an odd integer, $2n + 1$. In this case, the solution takes the form

\[ y(x) = AJ_{n+1/2}(x) + BJ_{-n-1/2}(x). \]

As an example, let’s consider the case $\nu = \frac{1}{2}$ so that Bessel’s equation is

\[ \frac{d^2y}{dx^2} + \frac{1}{x}\frac{dy}{dx} + \left(1 - \frac{1}{4x^2}\right)y = 0. \]

We considered this example in detail in Section 1.3.1, and found that

\[ y(x) = a_0\frac{\cos x}{x^{1/2}} + a_1\frac{\sin x}{x^{1/2}}. \]

This means that (see Exercise 3.2)

\[ J_{1/2}(x) = \sqrt{\frac{2}{\pi x}}\sin x, \qquad J_{-1/2}(x) = \sqrt{\frac{2}{\pi x}}\cos x. \]

The recurrence relations (3.21) and (3.22), which we will derive later, then show
that $J_{n+1/2}(x)$ and $J_{-n-1/2}(x)$ are products of finite polynomials with $\sin x$ and
$\cos x$.
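
The closed forms for the Bessel functions of order $\pm\frac{1}{2}$ can be compared with besselj on a grid. The grid below is an arbitrary choice of ours, kept away from $x = 0$, where $J_{-1/2}$ is singular.

```matlab
% Compare besselj at order +/- 1/2 with the closed-form expressions.
x = linspace(0.1, 20, 200);   % sample points (assumed range, avoiding x = 0)

err_sin = max(abs(besselj( 0.5, x) - sqrt(2./(pi*x)).*sin(x)));
err_cos = max(abs(besselj(-0.5, x) - sqrt(2./(pi*x)).*cos(x)));

fprintf('max error, J_{1/2}  vs sqrt(2/(pi x)) sin x: %.2e\n', err_sin);
fprintf('max error, J_{-1/2} vs sqrt(2/(pi x)) cos x: %.2e\n', err_cos);
```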

Finally we consider what happens when $2\nu$ is an even integer, and hence $\nu$ is an
integer. A rather lengthy calculation allows us to write the solution in the form

\[ y = AJ_\nu(x) + BY_\nu(x), \]

where $Y_\nu$ is Weber’s Bessel function of order $\nu$ defined as

\[ Y_\nu(x) = \frac{J_\nu(x)\cos\nu\pi - J_{-\nu}(x)}{\sin\nu\pi}. \tag{3.10} \]

Notice that the denominator of this expression is obviously zero when $\nu$ is an integer,
so this case requires careful treatment. We note that the second solution of Bessel’s
equation can also be determined using the method of reduction of order as

\[ y(x) = AJ_\nu(x) + BJ_\nu(x)\int^x\frac{1}{qJ_\nu(q)^2}\,dq. \]

In Figure 3.1 we show $J_i(x)$ for $i = 0$ to 3. Note that $J_0(0) = 1$, but that $J_i(0) = 0$
for $i > 0$, and that $J_i^{(j)}(0) = 0$ for $j < i$, $i > 1$. We generated Figure 3.1 using the
MATLAB script

    x=0:0.02:20;
    subplot(2,2,1), plot(x,besselj(0,x)), title('J_0(x)')
    subplot(2,2,2), plot(x,besselj(1,x)), title('J_1(x)')
    subplot(2,2,3), plot(x,besselj(2,x)), title('J_2(x)')
    subplot(2,2,4), plot(x,besselj(3,x)), title('J_3(x)')


We produced Figures 3.2, 3.5 and 3.6 in a similar way, using the MATLAB functions 
bessely, besseli and besselk. 

In Figure 3.2 we show the first two Weber’s Bessel functions of integer order.
Notice that as $x \to 0$, $Y_n(x) \to -\infty$. As you can see, all of these Bessel functions
are oscillatory. The first three zeros of $J_0(x)$ are 2.4048, 5.5201 and 8.6537, whilst
the first three nontrivial zeros of $J_1(x)$ are 3.8317, 7.0156 and 10.1735, all to four
decimal places.

Fig. 3.1. The functions $J_0(x)$, $J_1(x)$, $J_2(x)$ and $J_3(x)$.

Fig. 3.2. The functions $Y_0(x)$ and $Y_1(x)$.
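
The zeros quoted above can be reproduced with fzero. In this sketch the bracketing intervals are simply read off by eye from a plot of the functions, which is an assumption of ours rather than a systematic method.

```matlab
% First three positive zeros of J_0 and J_1 using fzero.
J0 = @(x) besselj(0, x);
J1 = @(x) besselj(1, x);

% Brackets chosen by inspecting the graphs (an assumption); each
% interval contains exactly one sign change of the function.
z0 = [fzero(J0, [2 3]),   fzero(J0, [5 6]),     fzero(J0, [8 9])];
z1 = [fzero(J1, [3 4.5]), fzero(J1, [6.5 7.5]), fzero(J1, [9.5 10.5])];

fprintf('zeros of J_0: %.4f %.4f %.4f\n', z0);
fprintf('zeros of J_1: %.4f %.4f %.4f\n', z1);
```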


3.3 The Generating Function for $J_n(x)$, $n$ an integer

As with the Legendre polynomials, there is a simple function that will generate
all of the Bessel functions of integer order. In order to establish this, it is useful to
manipulate the definition of the $J_\nu(x)$. From (3.9), we have, for $\nu = n$,

\[ J_n(x) = \frac{x^n}{2^n\Gamma(1 + n)}\sum_{i=0}^{\infty}\frac{(-x^2/4)^i}{i!\,(1 + n)_i} = \sum_{i=0}^{\infty}\frac{(-1)^ix^{2i+n}}{2^{2i+n}\,i!\,(1 + n)_i\,\Gamma(1 + n)}. \]

Using (3.6) we note that $\Gamma(1 + n)(1 + n)_i = \Gamma(1 + n + i)$, and using (3.4) we find
that this is equal to $(n + i)!$. For $n$ an integer, we can therefore write

\[ J_n(x) = \sum_{i=0}^{\infty}\frac{(-1)^ix^{2i+n}}{2^{2i+n}\,i!\,(n + i)!}. \tag{3.11} \]

Let’s now consider the generating function

\[ g(x, t) = \exp\left\{\tfrac{1}{2}x\left(t - \frac{1}{t}\right)\right\}. \tag{3.12} \]

The series expansions of each of the constituents of this are

\[ \exp(xt/2) = \sum_{j=0}^{\infty}\frac{(xt/2)^j}{j!}, \qquad \exp\left(-\frac{x}{2t}\right) = \sum_{i=0}^{\infty}\frac{(-x/2t)^i}{i!}, \]

both from the Taylor series for $e^x$. These summations can be combined to produce

\[ g(x, t) = \sum_{j=0}^{\infty}\sum_{i=0}^{\infty}\frac{(-1)^ix^{i+j}t^{j-i}}{2^{i+j}\,i!\,j!}. \]

Now, putting $j = i + n$ so that $-\infty < n < \infty$, this becomes

\[ g(x, t) = \sum_{n=-\infty}^{\infty}\left\{\sum_{i=0}^{\infty}\frac{(-1)^ix^{2i+n}}{2^{2i+n}\,i!\,(n + i)!}\right\}t^n. \]

Comparing the coefficients in this series with the expression (3.11) we find that

\[ g(x, t) = \exp\left\{\tfrac{1}{2}x\left(t - \frac{1}{t}\right)\right\} = \sum_{n=-\infty}^{\infty} J_n(x)t^n. \tag{3.13} \]

We can now exploit this relation to study Bessel functions. 
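
Before doing so, it is reassuring to check (3.13) numerically. The sketch below truncates the doubly infinite sum at $|n| \leq N$; the values of x, t and N are arbitrary choices of ours, and the neglected terms are negligible because $J_n(x)$ decays very rapidly with $|n|$ for fixed $x$.

```matlab
% Truncated generating function sum compared with the closed form (3.13).
x = 1.7;    % assumed test value
t = 0.8;    % assumed test value, t ~= 0
N = 20;     % truncation level (assumption)
n = -N:N;

series = sum(besselj(n, x) .* t.^n);   % sum of J_n(x) t^n for |n| <= N
exact  = exp(0.5*x*(t - 1/t));         % exp{x (t - 1/t)/2}
gen_err = abs(series - exact);

% The symmetry J_{-n}(x) = (-1)^n J_n(x) for integer n
k = 1:5;
sym_err = max(abs(besselj(-k, x) - (-1).^k .* besselj(k, x)));

fprintf('generating function error: %.2e\n', gen_err);
fprintf('symmetry error:            %.2e\n', sym_err);
```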

Using the fact that the generating function is invariant under $t \mapsto -1/t$, we have

\[ \sum_{n=-\infty}^{\infty} J_n(x)t^n = \sum_{n=-\infty}^{\infty} J_n(x)\left(-\frac{1}{t}\right)^n = \sum_{n=-\infty}^{\infty} J_n(x)(-1)^nt^{-n}. \]

Now putting $m = -n$, this is equal to

\[ \sum_{m=\infty}^{-\infty} J_{-m}(x)(-1)^mt^m. \]

Now let $n = m$ in the series on the right hand side, which gives

\[ \sum_{n=-\infty}^{\infty} J_n(x)t^n = \sum_{n=-\infty}^{\infty} J_{-n}(x)(-1)^nt^n. \]

Comparing like terms in the series, we find that $J_n(x) = (-1)^nJ_{-n}(x)$, and hence
that $J_n(x)$ and $J_{-n}(x)$ are linearly dependent over $\mathbb{R}$ (see Section 1.2.1). This
explains why the solution of Bessel’s equation proceeds rather differently when $\nu$ is
an integer, since $J_\nu(x)$ and $J_{-\nu}(x)$ cannot then be independent solutions.

The generating function can also be used to derive an integral representation of
$J_n(x)$. The first step is to put $t = e^{\pm i\theta}$ in (3.13), which yields

\[ e^{\pm ix\sin\theta} = J_0(x) + \sum_{n=1}^{\infty}(\pm 1)^n\left\{e^{in\theta} + (-1)^ne^{-in\theta}\right\}J_n(x), \]

or, in terms of sine and cosine,

\[ e^{\pm ix\sin\theta} = J_0(x) + 2\sum_{n=1}^{\infty} J_{2n}(x)\cos 2n\theta \pm 2i\sum_{n=0}^{\infty} J_{2n+1}(x)\sin(2n + 1)\theta. \]

Appropriate combinations of these two cases show that

\[ \cos(x\sin\theta) = J_0(x) + 2\sum_{n=1}^{\infty} J_{2n}(x)\cos 2n\theta, \tag{3.14} \]

\[ \sin(x\sin\theta) = 2\sum_{n=0}^{\infty} J_{2n+1}(x)\sin(2n + 1)\theta, \tag{3.15} \]

and, substituting $\eta = \frac{\pi}{2} - \theta$, we obtain

\[ \cos(x\cos\eta) = J_0(x) + 2\sum_{n=1}^{\infty}(-1)^nJ_{2n}(x)\cos 2n\eta, \tag{3.16} \]

\[ \sin(x\cos\eta) = 2\sum_{n=0}^{\infty}(-1)^nJ_{2n+1}(x)\cos(2n + 1)\eta. \tag{3.17} \]

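Multiplying (3.14) by $\cos m\theta$ and integrating from zero to $\pi$, we find that

\[ \int_{\theta=0}^{\pi}\cos m\theta\cos(x\sin\theta)\,d\theta = \begin{cases}\pi J_m(x) & \text{for } m \text{ even},\\ 0 & \text{for } m \text{ odd}.\end{cases} \]

Similarly we find that

\[ \int_{\theta=0}^{\pi}\sin m\theta\sin(x\sin\theta)\,d\theta = \begin{cases} 0 & \text{for } m \text{ even},\\ \pi J_m(x) & \text{for } m \text{ odd},\end{cases} \]

from (3.15). Adding these expressions gives us the integral representation

\[ J_n(x) = \frac{1}{\pi}\int_0^{\pi}\cos(n\theta - x\sin\theta)\,d\theta. \tag{3.18} \]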
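
The integral representation (3.18) is straightforward to test with numerical quadrature. The following sketch uses MATLAB's integral function; the evaluation point and the range of orders are arbitrary choices of ours.

```matlab
% Check J_n(x) = (1/pi) * int_0^pi cos(n*theta - x*sin(theta)) dtheta.
x = 3.2;          % assumed test value
orders = 0:3;     % integer orders to test (assumption)
rep_err = zeros(size(orders));

for k = 1:numel(orders)
    n = orders(k);
    integrand = @(theta) cos(n*theta - x*sin(theta));
    Jn_quad = integral(integrand, 0, pi)/pi;     % right hand side of (3.18)
    rep_err(k) = abs(Jn_quad - besselj(n, x));   % compare with besselj
    fprintf('n = %d: quadrature %.10f, besselj %.10f\n', n, Jn_quad, besselj(n, x));
end
```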

We shall now describe a problem concerning planetary motion that makes use 
of (3.18). This is the context in which Bessel originally studied these functions. 

Example: Planetary motion 

We consider the motion of a planet under the action of the gravitational force ex- 
erted by the Sun. In doing this, we neglect the effect of the gravitational attraction 
of the other bodies in the solar system, which are all much less massive than the 
Sun. Under this approximation, it is straightforward to prove Kepler’s first and 
second laws, that the planet moves on an ellipse, with the Sun at one of its foci,
and that the line from the planet to the Sun ($PS$ in Figure 3.3) sweeps out equal
areas of the ellipse in equal times (see, for example, Lunn, 1990). Our aim now
is to use Kepler’s first and second laws to obtain a measure of how the passage of
the planet around its orbit depends upon time. We will denote the length of the
semi-major axis, $A'C$, by $a$ and the eccentricity by $e$. Note that the distance of the Sun
from the centre of the ellipse, $SC$, is $ea$. We also define the mean anomaly to be

\[ \mu = 2\pi\,\frac{\text{Area of the elliptic sector } ASP}{\text{Area of the ellipse}}, \]

which Kepler’s second law tells us is proportional to the time of passage from $A$ to
$P$.

Fig. 3.3. The elliptical orbit of a planet, $P$, around the Sun at a focus, $S$. Here, $C$ is the
centre of the ellipse and $A$ and $A'$ the extrema of the orbit on the major axis of the ellipse.

Let’s now consider the auxiliary circle, which has centre $C$ and passes through
$A$ and $A'$, as shown in Figure 3.4. We label the projection of the point $P$ onto the
auxiliary circle as $Q$. We will also need to introduce the eccentric anomaly of $P$,
which is defined to be the angle $ACQ$, and can be written as

\[ \phi = 2\pi\,\frac{\text{Area of the sector } ACQ}{\text{Area of the auxiliary circle}}. \]

We now note that, by orthogonal projection, the ratio of the area of $ASP$ to that
of the ellipse is the same as the ratio of the area of $ASQ$ to that of the auxiliary
circle. The area of $ASQ$ is given by the area of the sector $ACQ$ ($\frac{1}{2}\phi a^2$) minus the
area of the triangle $CSQ$ ($\frac{1}{2}ea^2\sin\phi$), so that

\[ \frac{\mu}{2\pi} = \frac{\frac{1}{2}a^2\phi - \frac{1}{2}ea^2\sin\phi}{\pi a^2}, \]

and hence

\[ \mu = \phi - e\sin\phi. \tag{3.19} \]


Now, in order to determine $\phi$ as a function of $\mu$, we note that $\phi - \mu$ is a periodic
function of $\mu$, which vanishes when $P$ and $Q$ are coincident with $A$ or $A'$, that is
when $\mu$ is an integer multiple of $\pi$. Hence we must be able to write

\[ \phi - \mu = \sum_{n=1}^{\infty} A_n\sin n\mu. \tag{3.20} \]

Fig. 3.4. The auxiliary circle and the projection of $P$ onto $Q$.

As we shall see in Chapter 5, this is a Fourier series. In order to determine the
constant coefficients $A_n$, we differentiate (3.20) with respect to $\mu$ to yield

\[ \frac{d\phi}{d\mu} - 1 = \sum_{n=1}^{\infty} nA_n\cos n\mu. \]

We can now exploit the orthogonality of the functions $\cos n\mu$ to determine $A_n$. We
multiply through by $\cos m\mu$ and integrate from zero to $\pi$ to obtain

\[ \int_0^{\pi}\cos m\mu\,\frac{d\phi}{d\mu}\,d\mu - \int_0^{\pi}\cos m\mu\,d\mu = \int_0^{\pi}\cos m\mu\,\frac{d\phi}{d\mu}\,d\mu = \frac{\pi m}{2}A_m. \]

Since $\phi = 0$ when $\mu = 0$ and $\phi = \pi$ when $\mu = \pi$, we can change the independent
variable to give

\[ \frac{\pi m}{2}A_m = \int_{\phi=0}^{\pi}\cos m\mu\,d\phi. \]

Substituting for $\mu$ from (3.19), we have that

\[ A_m = \frac{2}{m\pi}\int_{\phi=0}^{\pi}\cos m(\phi - e\sin\phi)\,d\phi. \]

Finally, by direct comparison with (3.18), $A_m = \frac{2}{m}J_m(me)$, so that

\[ \phi = \mu + 2\sum_{m=1}^{\infty}\frac{1}{m}J_m(me)\sin m\mu. \]

Since $\mu$ is proportional to the time the planet takes to travel from $A$ to $P$, this
expression gives us the variation of the angle $\phi$ with time.
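
As an illustration, we can compare the series $\phi = \mu + 2\sum_{m} J_m(me)\sin(m\mu)/m$ with a direct numerical solution of (3.19). This is a sketch only: the eccentricity, the mean anomaly and the truncation level are values we have assumed for the purpose of the test.

```matlab
% Eccentric anomaly phi from the Bessel function series, compared with a
% direct root of Kepler's equation mu = phi - e*sin(phi), equation (3.19).
e  = 0.3;    % eccentricity (assumed test value, 0 < e < 1)
mu = 1.0;    % mean anomaly in radians (assumed test value, 0 < mu < pi)
M  = 30;     % number of terms retained in the series (assumption)
m  = 1:M;

% phi = mu + sum over m of (2/m) J_m(m e) sin(m mu)
phi_series = mu + sum((2./m) .* besselj(m, m*e) .* sin(m*mu));

% Direct solution: phi - e sin(phi) - mu changes sign on [0, pi]
phi_root = fzero(@(phi) phi - e*sin(phi) - mu, [0 pi]);

kepler_err = abs(phi_series - phi_root);
fprintf('series: %.10f, fzero: %.10f, difference: %.2e\n', ...
        phi_series, phi_root, kepler_err);
```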


3.4 Differential and Recurrence Relations Between Bessel Functions 

It is often useful to find relationships between Bessel functions with different indices.
We will derive two such relationships. We start with (3.9), multiply by $x^\nu$ and
differentiate to obtain

\[ \frac{d}{dx}\{x^\nu J_\nu(x)\} = \frac{d}{dx}\left\{\sum_{n=0}^{\infty}\frac{(-1)^nx^{2n+2\nu}}{2^{2n+\nu}\,n!\,\Gamma(1 + \nu + n)}\right\} = \sum_{n=0}^{\infty}\frac{(-1)^n(2n + 2\nu)x^{2n+2\nu-1}}{2^{2n+\nu}\,n!\,\Gamma(1 + \nu + n)}. \]

Since $\Gamma(1 + \nu + n) = (n + \nu)\Gamma(n + \nu)$, this gives a factor that cancels with the term
$2(n + \nu)$ in the numerator to give

\[ \frac{d}{dx}\{x^\nu J_\nu(x)\} = \sum_{n=0}^{\infty}\frac{(-1)^nx^{2n+2\nu-1}}{2^{2n+\nu-1}\,n!\,\Gamma(\nu + n)}. \]

We can rewrite this so that we have the series expansion for $J_{\nu-1}(x)$, as

\[ \frac{d}{dx}\{x^\nu J_\nu(x)\} = x^\nu\,\frac{x^{\nu-1}}{2^{\nu-1}\Gamma(\nu)}\sum_{n=0}^{\infty}\frac{(-x^2/4)^n}{n!\,(\nu)_n}, \]

so that

\[ \frac{d}{dx}\{x^\nu J_\nu(x)\} = x^\nu J_{\nu-1}(x). \tag{3.21} \]

Later, we will use this expression to develop relations between general Bessel functions.
Note that by putting $\nu$ equal to zero

\[ \frac{d}{dx}\{J_0(x)\} = J_{-1}(x). \]

However, we recall that $J_n(x) = (-1)^nJ_{-n}(x)$ for $n$ an integer, so that $J_1(x) =
-J_{-1}(x)$ and hence

\[ J_0'(x) = -J_1(x), \]

where we have used a prime to denote the derivative with respect to $x$.

In the same vein as the derivation of (3.21),

\[ \frac{d}{dx}\{x^{-\nu}J_\nu(x)\} = \frac{d}{dx}\left\{x^{-\nu}\frac{x^\nu}{2^\nu\Gamma(1 + \nu)}\sum_{n=0}^{\infty}\frac{(-x^2/4)^n}{n!\,(1 + \nu)_n}\right\} \]

\[ = \frac{d}{dx}\left\{\sum_{n=0}^{\infty}\frac{(-1)^nx^{2n}}{2^{2n+\nu}\,n!\,\Gamma(1 + \nu + n)}\right\} = \sum_{n=0}^{\infty}\frac{(-1)^nx^{2n-1}\,n}{2^{2n+\nu-1}\,n!\,\Gamma(1 + \nu + n)}. \]

Notice that the first term in this series is zero (due to the factor of $n$ in the numerator),
so we can start the series at $n = 1$ and cancel the factors of $n$. The series
can then be expressed in terms of the dummy variable $m = n - 1$ as

\[ \frac{d}{dx}\{x^{-\nu}J_\nu(x)\} = \frac{1}{2^{\nu-1}}\sum_{m=0}^{\infty}\frac{(-1)^{m+1}x^{2m+1}}{2^{2m+2}\,m!\,\Gamma(\nu + m + 2)} = -\frac{x}{2^{\nu+1}}\sum_{m=0}^{\infty}\frac{(-1)^mx^{2m}}{2^{2m}\,m!\,\Gamma(\nu + m + 2)}. \]

Using the fact that $\Gamma(\nu + m + 2) = \Gamma(\nu + 2)(\nu + 2)_m$ and the series expansion of
$J_{\nu+1}(x)$, (3.9), we have

\[ \frac{d}{dx}\{x^{-\nu}J_\nu(x)\} = -\frac{x}{2^{\nu+1}\Gamma(\nu + 2)}\sum_{m=0}^{\infty}\frac{(-x^2/4)^m}{m!\,(2 + \nu)_m}, \]

and consequently

\[ \frac{d}{dx}\{x^{-\nu}J_\nu(x)\} = -x^{-\nu}J_{\nu+1}(x). \tag{3.22} \]

Notice that (3.21) and (3.22) both hold for $Y_\nu(x)$ as well.

We can use these relationships to derive recurrence relations between the
Bessel functions. We expand the differentials in each expression to give the equations

\[ J_\nu'(x) + \frac{\nu}{x}J_\nu(x) = J_{\nu-1}(x), \]

where we have divided through by $x^\nu$, and

\[ J_\nu'(x) - \frac{\nu}{x}J_\nu(x) = -J_{\nu+1}(x), \]

where this time we have multiplied by $x^\nu$. By adding these expressions we find that

\[ J_\nu'(x) = \frac{1}{2}\left\{J_{\nu-1}(x) - J_{\nu+1}(x)\right\}, \]

and by subtracting them

\[ \frac{2\nu}{x}J_\nu(x) = J_{\nu-1}(x) + J_{\nu+1}(x), \]

which is a pure recurrence relationship.
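
These relations can be checked numerically. In the sketch below the derivative is approximated by a central difference, so agreement is only expected to within the accuracy of that approximation; the order, evaluation points and step size are our own choices. The final lines test a direct consequence of (3.21) with $\nu = 1$, namely $\int_0^X xJ_0(x)\,dx = XJ_1(X)$.

```matlab
% Residuals of the derivative and recurrence relations for J_nu.
nu = 1.5;    % order (assumed test value)
x  = 2.3;    % evaluation point (assumed test value)
h  = 1e-5;   % step for the central difference (assumption)

dJ = (besselj(nu, x + h) - besselj(nu, x - h))/(2*h);   % approximates J_nu'(x)

% J_nu' = (J_{nu-1} - J_{nu+1})/2
res_diff = dJ - 0.5*(besselj(nu - 1, x) - besselj(nu + 1, x));
% (2 nu/x) J_nu = J_{nu-1} + J_{nu+1}
res_rec = (2*nu/x)*besselj(nu, x) - (besselj(nu - 1, x) + besselj(nu + 1, x));

% Consequence of (3.21) with nu = 1: integral of x J_0(x) over [0, X]
X = 4;       % upper limit (assumed test value)
I_num = integral(@(s) s.*besselj(0, s), 0, X);
res_int = I_num - X*besselj(1, X);

fprintf('derivative relation residual: %.2e\n', res_diff);
fprintf('pure recurrence residual:     %.2e\n', res_rec);
fprintf('integral residual:            %.2e\n', res_int);
```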

These results can also be used when integrating Bessel functions. For example,
consider the integral

\[ I = \int xJ_0(x)\,dx. \]

This can be integrated using (3.21) with $\nu = 1$, since

\[ I = \int xJ_0(x)\,dx = \int (xJ_1(x))'\,dx = xJ_1(x). \]


3.5 Modified Bessel Functions

We will now consider the solutions of the modified Bessel equation,

\[ x^2\frac{d^2y}{dx^2} + x\frac{dy}{dx} - (x^2 + \nu^2)y = 0, \tag{3.23} \]

which can be derived from Bessel’s equation using $x \mapsto ix$, so that the solutions
could be written down as conventional Bessel functions with purely imaginary arguments.
However, it is more transparent to introduce the modified Bessel function
of first kind of order $\nu$, $I_\nu(x)$, so that the complete solution of (3.23) is

\[ y(x) = AI_\nu(x) + BI_{-\nu}(x), \]

provided $\nu$ is not an integer. As was the case when we introduced the function $Y_\nu(x)$,
there is a corresponding function here, $K_\nu(x)$, the modified Bessel function of
second kind of order $\nu$. This is defined as

\[ K_\nu(x) = \frac{\pi\left\{I_{-\nu}(x) - I_\nu(x)\right\}}{2\sin\nu\pi}. \]

Note that the slight difference between the definition of this Bessel function and
that of Weber’s Bessel function of order $\nu$, (3.10), occurs because these functions
must agree with the definition of the ordinary Bessel functions when $\nu$ is an integer.
Most of the results we have derived in this chapter can be modified by changing the
argument from $x$ to $ix$ in deriving, for example, the recurrence relations. The first
few modified Bessel functions are shown in Figures 3.5 and 3.6. Note the contrast
in behaviour between these and the Bessel functions $J_\nu(x)$ and $Y_\nu(x)$.

The analogues of (3.21) and (3.22) for the modified Bessel functions are

\[ \frac{d}{dx}\{x^\nu I_\nu(x)\} = x^\nu I_{\nu-1}(x), \qquad \frac{d}{dx}\{x^{-\nu}I_\nu(x)\} = x^{-\nu}I_{\nu+1}(x) \]

and

\[ \frac{d}{dx}\{x^\nu K_\nu(x)\} = -x^\nu K_{\nu-1}(x), \qquad \frac{d}{dx}\{x^{-\nu}K_\nu(x)\} = -x^{-\nu}K_{\nu+1}(x). \]

3.6 Orthogonality of the Bessel Functions 

In this section we will show that the Bessel functions are orthogonal, and hence can 
be used as a basis over a certain interval. This will then allow us to develop the 
Fourier-Bessel series, which can be used to represent functions in much the same 
way as the Fourier-Legendre series. 

We will consider the interval $[0, a]$ where, at this stage, $a$ remains arbitrary. We
start from the fact that the function $J_\nu(x)$ satisfies the Bessel equation, (3.1), and
make a simple transformation, replacing $x$ by $\lambda x$, where $\lambda$ is a real constant, to
give

\[ x^2\frac{d^2}{dx^2}J_\nu(\lambda x) + x\frac{d}{dx}J_\nu(\lambda x) + (\lambda^2x^2 - \nu^2)J_\nu(\lambda x) = 0. \tag{3.24} \]

We choose $\lambda$ so that $J_\nu(\lambda a)$ is equal to zero. There is a countably infinite number
of values of $\lambda$ for which this is true, as we can deduce from Figure 3.1. Now choose
$\mu \neq \lambda$ so that $J_\nu(\mu a) = 0$. Of course $J_\nu(\mu x)$ also satisfies (3.24) with $\lambda$ replaced by
$\mu$. We now multiply (3.24) through by $J_\nu(\mu x)/x$ and integrate between zero and
$a$, which yields

\[ \int_0^a J_\nu(\mu x)\left\{x\frac{d^2}{dx^2}J_\nu(\lambda x) + \frac{d}{dx}J_\nu(\lambda x) + \frac{1}{x}(\lambda^2x^2 - \nu^2)J_\nu(\lambda x)\right\}dx = 0. \tag{3.25} \]

Fig. 3.5. The modified Bessel functions of first kind of order zero to three. Note that
they are bounded at $x = 0$ and monotone increasing for $x > 0$, with $I_\nu(x) \sim e^x/\sqrt{2\pi x}$ as
$x \to \infty$.

Fig. 3.6. The modified Bessel functions of the second kind of order zero and order one.
Note that they are singular, with $K_0(x) \sim -\log x$ and $K_n(x) \sim 2^{n-1}(n - 1)!/x^n$ for $n$ a
positive integer, as $x \to 0$. These functions are monotone decreasing for $x > 0$, tending to
zero exponentially fast as $x \to \infty$.

Notice that we could have also multiplied the differential equation for $J_\nu(\mu x)$ by
$J_\nu(\lambda x)/x$ and integrated to give

\[ \int_0^a J_\nu(\lambda x)\left\{x\frac{d^2}{dx^2}J_\nu(\mu x) + \frac{d}{dx}J_\nu(\mu x) + \frac{1}{x}(\mu^2x^2 - \nu^2)J_\nu(\mu x)\right\}dx = 0. \tag{3.26} \]

We now subtract (3.26) from (3.25) to give

\[ \int_0^a\left\{J_\nu(\mu x)\frac{d}{dx}\left(x\frac{d}{dx}J_\nu(\lambda x)\right) - J_\nu(\lambda x)\frac{d}{dx}\left(x\frac{d}{dx}J_\nu(\mu x)\right) + x(\lambda^2 - \mu^2)J_\nu(\lambda x)J_\nu(\mu x)\right\}dx = 0. \tag{3.27} \]

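We have simplified these expressions slightly by observing that $xJ_\nu'' + J_\nu' = (xJ_\nu')'$.
Now consider the first term of the integrand and note that it can be integrated by
parts to give

\[ \int_0^a J_\nu(\mu x)\frac{d}{dx}\left(x\frac{d}{dx}J_\nu(\lambda x)\right)dx = \left[J_\nu(\mu x)\,x\frac{d}{dx}J_\nu(\lambda x)\right]_0^a - \int_0^a x\frac{d}{dx}J_\nu(\lambda x)\frac{d}{dx}J_\nu(\mu x)\,dx. \]

Similarly for the second term, which is effectively the same with $\mu$ and $\lambda$ interchanged,

\[ \int_0^a J_\nu(\lambda x)\frac{d}{dx}\left(x\frac{d}{dx}J_\nu(\mu x)\right)dx = \left[J_\nu(\lambda x)\,x\frac{d}{dx}J_\nu(\mu x)\right]_0^a - \int_0^a x\frac{d}{dx}J_\nu(\mu x)\frac{d}{dx}J_\nu(\lambda x)\,dx. \]

Using these expressions in (3.27), we find that

\[ (\lambda^2 - \mu^2)\int_0^a xJ_\nu(\lambda x)J_\nu(\mu x)\,dx = J_\nu(\lambda a)\,a\mu J_\nu'(\mu a) - J_\nu(\mu a)\,a\lambda J_\nu'(\lambda a). \tag{3.28} \]

Finally, since we chose $\lambda$ and $\mu$ so that $J_\nu(\lambda a) = J_\nu(\mu a) = 0$ and $\mu \neq \lambda$,

\[ \int_0^a xJ_\nu(\mu x)J_\nu(\lambda x)\,dx = 0. \]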

This is an orthogonality relation for the Bessel functions, with weighting function
$w(x) = x$. We now need to calculate the value of $\int_0^a xJ_\nu(\mu x)^2\,dx$. To this end, we
substitute $\lambda = \mu + \epsilon$ into (3.28). For $\epsilon \ll 1$, neglecting terms in $\epsilon^2$ and smaller, we
find that

\[ 2\mu\epsilon\int_0^a xJ_\nu(\mu x)J_\nu(\mu x + \epsilon x)\,dx = a\left[\mu J_\nu(\mu a + \epsilon a)J_\nu'(\mu a) - (\mu + \epsilon)J_\nu(\mu a)J_\nu'(\mu a + \epsilon a)\right]. \tag{3.29} \]

In order to deal with the terms evaluated at $x = a(\mu + \epsilon)$ we consider Taylor series
expansions, $J_\nu(a(\mu + \epsilon)) = J_\nu(a\mu) + \epsilon aJ_\nu'(a\mu) + \cdots$, $J_\nu'(a(\mu + \epsilon)) = J_\nu'(a\mu) +
\epsilon aJ_\nu''(a\mu) + \cdots$. These expansions can then be substituted into (3.29). On dividing
through by $\epsilon$ and considering the limit $\epsilon \to 0$, we find that

\[ \int_0^a x[J_\nu(\mu x)]^2\,dx = \frac{a^2}{2}[J_\nu'(\mu a)]^2 - \frac{1}{2}a^2J_\nu(\mu a)J_\nu''(\mu a) - \frac{a}{2\mu}J_\nu(\mu a)J_\nu'(\mu a). \]

We now suppose that $J_\nu(\mu a) = 0$, which gives

\[ \int_0^a x[J_\nu(\mu x)]^2\,dx = \frac{a^2}{2}[J_\nu'(\mu a)]^2. \]

In general,

\[ \int_0^a xJ_\nu(\mu x)J_\nu(\lambda x)\,dx = \frac{a^2}{2}[J_\nu'(\mu a)]^2\,\delta_{\lambda\mu}, \tag{3.30} \]

where $J_\nu(\mu a) = J_\nu(\lambda a) = 0$ and $\delta_{\lambda\mu}$ is the Kronecker delta function.
We can now construct a series expansion of the form

\[ f(x) = \sum_{i=1}^{\infty} C_iJ_\nu(\lambda_ix). \tag{3.31} \]

This is known as a Fourier-Bessel series, and the $\lambda_i$ are chosen such that
$J_\nu(\lambda_ia) = 0$ for $i = 1, 2, \ldots$, $\lambda_1 < \lambda_2 < \cdots$. As we shall see later, both $f(x)$
and $f'(x)$ must be piecewise continuous for this series to converge. After multiplying
both sides of (3.31) by $xJ_\nu(\lambda_jx)$ and integrating over the interval $[0, a]$ we find
that

\[ \int_0^a xJ_\nu(\lambda_jx)f(x)\,dx = \int_0^a xJ_\nu(\lambda_jx)\sum_{i=1}^{\infty} C_iJ_\nu(\lambda_ix)\,dx. \]

Assuming that the series converges, we can interchange the integral and the summation
to obtain

\[ \int_0^a xJ_\nu(\lambda_jx)f(x)\,dx = \sum_{i=1}^{\infty} C_i\int_0^a xJ_\nu(\lambda_jx)J_\nu(\lambda_ix)\,dx. \]

We can now use (3.30) to give

\[ \int_0^a xJ_\nu(\lambda_jx)f(x)\,dx = \sum_{i=1}^{\infty} C_i\frac{a^2}{2}[J_\nu'(\lambda_ja)]^2\,\delta_{\lambda_j\lambda_i} = C_j\frac{a^2}{2}[J_\nu'(\lambda_ja)]^2, \]

and hence

\[ C_j = \frac{2}{a^2[J_\nu'(\lambda_ja)]^2}\int_0^a xJ_\nu(\lambda_jx)f(x)\,dx. \tag{3.32} \]
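
The orthogonality relation and the normalization integral can be confirmed by quadrature. The sketch below takes $\nu = 0$ and $a = 1$ (our choices), brackets the first two zeros of $J_0$ by eye, and uses $J_0' = -J_1$ to evaluate the derivative in the normalization formula.

```matlab
% Orthogonality of J_0(lambda_1 x) and J_0(lambda_2 x) on [0, a] with weight x.
a = 1;                                   % interval length (assumed)
J0 = @(x) besselj(0, x);
% First two zeros of J_0, bracketed by inspection (an assumption);
% dividing by a gives lambda with J_0(lambda*a) = 0.
lam = [fzero(J0, [2 3]), fzero(J0, [5 6])]/a;

cross = integral(@(x) x.*besselj(0, lam(1)*x).*besselj(0, lam(2)*x), 0, a);
norm1 = integral(@(x) x.*besselj(0, lam(1)*x).^2, 0, a);

% (a^2/2) [J_0'(lambda_1 a)]^2, with J_0' = -J_1
norm1_formula = (a^2/2)*besselj(1, lam(1)*a)^2;

fprintf('cross integral:     %.2e\n', cross);
fprintf('norm by quadrature: %.10f\n', norm1);
fprintf('norm by formula:    %.10f\n', norm1_formula);
```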


Example 

Let’s try to expand the function $f(x) = 1$ on the interval $0 \leq x \leq 1$, as a Fourier-Bessel
series. Since $J_0(0) = 1$ but $J_i(0) = 0$ for $i > 0$, we will choose $\nu$ to be zero for
our expansion. We rely on the existence of a set of values $\lambda_j$ such that $J_0(\lambda_j) = 0$
for $j = 1, 2, \ldots$ (see Figure 3.1). We will need to determine these values either from
tables or numerically.

Using (3.32), we have

\[ C_j = \frac{2}{[J_0'(\lambda_j)]^2}\int_0^1 xJ_0(\lambda_jx)\,dx. \]

If we introduce the variable $y = \lambda_jx$ (so that $dy = \lambda_j\,dx$), the integral becomes

\[ C_j = \frac{2}{[J_0'(\lambda_j)]^2}\frac{1}{\lambda_j^2}\int_{y=0}^{\lambda_j} yJ_0(y)\,dy. \]

Using the expression (3.21) with $\nu = 1$, and recalling that $J_0' = -J_1$, we have

\[ C_j = \frac{2}{[J_0'(\lambda_j)]^2\lambda_j^2}\int_{y=0}^{\lambda_j}\frac{d}{dy}\left(yJ_1(y)\right)dy = \frac{2}{\lambda_j^2[J_0'(\lambda_j)]^2}\Big[yJ_1(y)\Big]_0^{\lambda_j} = \frac{2J_1(\lambda_j)}{\lambda_j[J_0'(\lambda_j)]^2} = \frac{2}{\lambda_jJ_1(\lambda_j)}, \]

and hence

\[ \sum_{i=1}^{\infty}\frac{2}{\lambda_iJ_1(\lambda_i)}J_0(\lambda_ix) = 1 \quad \text{for } 0 \leq x < 1, \]

where $J_0(\lambda_i) = 0$ for $i = 1, 2, \ldots$. In Figure 3.7 we show the sum of the first fifteen
terms of the Fourier-Bessel series. Notice the oscillatory nature of the solution,
which is more pronounced in the neighbourhood of the discontinuity at $x = 1$. This
phenomenon always occurs in series expansions relative to sequences of orthogonal
functions, and is called Gibbs’ phenomenon.

Before we can give a MATLAB script that produces Figure 3.7, we need to be
able to calculate the zeros of Bessel functions. Here we merely state a couple of
simple results and explain how these are helpful. The interested reader is referred
to Watson (1922, Chapter 15) for a full discussion of this problem. The Bessel-Lommel
theorem on the location of the zeros of the Bessel functions $J_\nu(x)$ states
that when $-\frac{1}{2} < \nu \leq \frac{1}{2}$ and $m\pi < x < (m + \frac{1}{2})\pi$, $J_\nu(x)$ is positive for even $m$ and
negative for odd $m$. This implies that $J_\nu(x)$ has an odd number of zeros in the
intervals $(2n - 1)\pi/2 < x < n\pi$ for $n$ an integer. In fact, it can be shown that the
positive zeros of $J_0(x)$ lie in the intervals $n\pi + 3\pi/4 < x < n\pi + 7\pi/8$. This allows
us to use the ends of these intervals as an initial bracketing interval for the roots
of $J_0(x)$. A simple MATLAB script that uses this result is

Fig. 3.7. Fourier-Bessel series representation of the function $f(x) = 1$ on the interval $[0, 1)$
(truncated after fifteen terms). The expansion is only valid in the interval $x \in [0, 1)$. The
series is shown for $x > 1$ only to demonstrate that convergence is only guaranteed for the
associated interval.


together with the function

    function bessel = bessel(x);
    % Bessel function of the first kind, of order nu (set globally)
    global nu
    bessel = besselj(nu,x);

The MATLAB function fzero finds a zero of functions of a single variable in a given
interval. By defining the variables nu and ze as global, we make them available
to other functions. In particular, this allows us to use the computed positions of
the zeros, ze, in the script below.
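
The bracketing result quoted above can be sketched as follows. This is our own minimal version, not a reproduction of the original script: it uses an anonymous function in place of the global variable nu, and the number of zeros is an arbitrary choice.

```matlab
% Zeros of J_0 using the brackets n*pi + 3*pi/4 < x < n*pi + 7*pi/8.
nzeros = 15;                  % number of zeros required (assumption)
J0 = @(x) besselj(0, x);      % anonymous function instead of a global nu
ze = zeros(1, nzeros);        % storage for the computed zeros

for n = 0:nzeros-1
    bracket = [n*pi + 3*pi/4, n*pi + 7*pi/8];   % contains exactly one zero
    ze(n + 1) = fzero(J0, bracket);
end

fprintf('first three zeros of J_0: %.4f %.4f %.4f\n', ze(1:3));
```

Passing a function handle keeps the order of the Bessel function local to the script, at the cost of not sharing ze with other functions automatically.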

The zeros of $J_1(x)$ interlace those of $J_0(x)$. We can see this by noting that
$J_1(x) = -J_0'(x)$ and that both functions are continuous. Consequently the zeros of
$J_0(x)$ can be used as the bracketing intervals for the determination of the zeros of
$J_1(x)$. A MATLAB script for this is

The zeros of the other Bessel functions can be found by exploiting similar analytical
results.

We can now define a MATLAB function

which can be plotted with ezplot(@fourierbessel,[0 2]) to produce Figure 3.7.
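
A possible realization of these two steps is sketched below. It is our own reconstruction under stated assumptions, not the original listing: fourierbessel is defined here as an anonymous function handle rather than a function file, the zeros of $J_1$ are bracketed by consecutive zeros of $J_0$, and the coefficients $2/(\lambda_iJ_1(\lambda_i))$ are those derived in the example above.

```matlab
% Zeros of J_1 by interlacing, and the fifteen-term Fourier-Bessel sum for f = 1.
nterms = 15;                          % number of terms (as in Figure 3.7)
J0 = @(x) besselj(0, x);
J1 = @(x) besselj(1, x);

% Zeros of J_0 from the brackets n*pi + 3*pi/4 < x < n*pi + 7*pi/8
ze = zeros(1, nterms + 1);
for n = 0:nterms
    ze(n + 1) = fzero(J0, [n*pi + 3*pi/4, n*pi + 7*pi/8]);
end

% Each zero of J_1 lies between consecutive zeros of J_0
ze1 = zeros(1, nterms);
for k = 1:nterms
    ze1(k) = fzero(J1, [ze(k), ze(k + 1)]);
end

% Partial sum of the series with C_i = 2/(lambda_i J_1(lambda_i))
lam = ze(1:nterms);
coeff = 2./(lam.*besselj(1, lam));
fourierbessel = @(x) arrayfun(@(xx) sum(coeff.*besselj(0, lam*xx)), x);

value_mid = fourierbessel(0.5);       % should be close to 1
fprintf('partial sum at x = 0.5: %.4f\n', value_mid);
% To plot the handle, one could use: fplot(fourierbessel, [0 2])
```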


3.7 Inhomogeneous Terms in Bessel’s Equation 

So far we have only concerned ourselves with homogeneous forms of Bessel’s equation.
The inhomogeneous version of Bessel’s equation,

$$ x^2 \frac{d^2 y}{dx^2} + x \frac{dy}{dx} + (x^2 - \nu^2) y = f(x), $$

can be dealt with by using the technique of variation of parameters (see Section 1.2).
The solution can be written as

$$ y(x) = \int^x \frac{\pi}{2s} f(s) \left( J_\nu(s) Y_\nu(x) - J_\nu(x) Y_\nu(s) \right) ds + A J_\nu(x) + B Y_\nu(x). \tag{3.33} $$

Here we have made use of the fact that the Wronskian associated with $J_\nu(x)$ and
$Y_\nu(x)$ is $2/\pi x$, which can be derived using Abel’s formula, (1.7). The constant
can then be found by considering the behaviour of the functions close to $x = 0$ (see
Exercise 3.5).

This can be determined by using (3.33) with $f(x) = x^2$ and $\nu = 0$, so that the
general solution is

$$ y(x) = \int^x \frac{\pi s}{2} \left( J_0(s) Y_0(x) - J_0(x) Y_0(s) \right) ds + A J_0(x) + B Y_0(x). $$

In order to integrate $s J_0(s)$ we note that this can be written as $(s J_1(s))'$ and
similarly for $s Y_0(s)$, which gives

$$ y(x) = \frac{\pi x}{2} \left( J_1(x) Y_0(x) - J_0(x) Y_1(x) \right) + A J_0(x) + B Y_0(x). $$

But we note that $J_1(x) = -J_0'(x)$ and $Y_1(x) = -Y_0'(x)$, so that the expression in
the brackets is merely the Wronskian, and hence

$$ y(x) = 1 + A J_0(x) + B Y_0(x). $$

Although it is clear, with hindsight, that $y(x) = 1$ is the particular integral solution,
simple solutions are not always easy to spot a priori.
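
The final step above rests on the identity $\frac{\pi x}{2}(J_1 Y_0 - J_0 Y_1) = 1$. A quick numerical check, on sample points of our own choosing, might look like this:

```matlab
% Check that (pi*x/2)*(J_1*Y_0 - J_0*Y_1) equals 1 at a few sample points.
x   = linspace(0.5, 10, 20);               % sample points (our choice)
yp  = (pi*x/2) .* (besselj(1, x).*bessely(0, x) - besselj(0, x).*bessely(1, x));
err = max(abs(yp - 1));
disp(err)
```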


Example

Let’s find the particular integral of

$$ x^2 \frac{d^2 y}{dx^2} + x \frac{dy}{dx} + (x^2 - \nu^2) y = x. \tag{3.34} $$

We will look for a series solution as used in the method of Frobenius, namely

$$ y(x) = \sum_{n=0}^{\infty} a_n x^{n+c}. $$

Substituting this into (3.34), we obtain an expression similar to (3.7),

$$ a_0 (c^2 - \nu^2) x^c + a_1 \{(1 + c)^2 - \nu^2\} x^{c+1} + \sum_{n=2}^{\infty} \left[ a_n \{(n + c)^2 - \nu^2\} + a_{n-2} \right] x^{n+c} = x. $$

Note that $x^c$ needs to match with the $x$ on the right hand side so that $c = 1$ and
$a_0 = 1/(1 - \nu^2)$. We will defer discussion of the case $\nu = 1$. At next order we find
that $a_1 (2^2 - \nu^2) = 0$, and consequently, unless $\nu = 2$, we have $a_1 = 0$. For the
general terms in the summation we have

$$ a_n = -\frac{a_{n-2}}{(n + 1)^2 - \nu^2}. $$

Note that since $a_1 = 0$, $a_n = 0$ for $n$ odd. It now remains to determine the general
term in the sequence. For $n = 2$ we find that

$$ a_2 = -\frac{a_0}{3^2 - \nu^2} = -\frac{1}{(1^2 - \nu^2)(3^2 - \nu^2)}, $$

and then with $n = 4$ and using the form of $a_2$ we have

$$ a_4 = \frac{1}{(1^2 - \nu^2)(3^2 - \nu^2)(5^2 - \nu^2)}. $$

The general expression is

$$ a_{2n} = (-1)^n \prod_{i=1}^{n+1} \frac{1}{(2i - 1)^2 - \nu^2}. $$

This can be manipulated by factorizing the denominator and extracting a factor of
$2^2$ from each term in the product (of which there are $n + 1$). This gives

$$ a_{2n} = \frac{(-1)^n}{2^{2n+2}} \prod_{i=1}^{n+1} \frac{1}{i - \frac{1}{2} - \frac{1}{2}\nu} \prod_{i=1}^{n+1} \frac{1}{i - \frac{1}{2} + \frac{1}{2}\nu} = \frac{(-1)^n}{2^{2n+2} \left(\frac{1}{2} + \frac{1}{2}\nu\right)_{n+1} \left(\frac{1}{2} - \frac{1}{2}\nu\right)_{n+1}}. $$

Hence the particular integral of the differential equation is

$$ y(x) = \sum_{n=0}^{\infty} \frac{(-1)^n x^{2n+1}}{2^{2n+2} \left(\frac{1}{2} + \frac{1}{2}\nu\right)_{n+1} \left(\frac{1}{2} - \frac{1}{2}\nu\right)_{n+1}}. \tag{3.35} $$


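In fact, solutions of the equation

$$ x^2 \frac{d^2 y}{dx^2} + x \frac{dy}{dx} + (x^2 - \nu^2) y = x^{\mu+1} \tag{3.36} $$

are commonly written as $s_{\mu,\nu}(x)$ and are called Lommel’s functions. They are
undefined when $\mu \pm \nu$ is an odd negative integer, a case that is discussed in depth
by Watson (1922). The series expansion of the solution of (3.36) is

$$ s_{\mu,\nu}(x) = x^{\mu-1} \sum_{m=0}^{\infty} \frac{(-1)^m \left(\frac{1}{2}x\right)^{2m+2} \Gamma\left(\frac{1}{2}\mu - \frac{1}{2}\nu + \frac{1}{2}\right) \Gamma\left(\frac{1}{2}\mu + \frac{1}{2}\nu + \frac{1}{2}\right)}{\Gamma\left(\frac{1}{2}\mu - \frac{1}{2}\nu + m + \frac{3}{2}\right) \Gamma\left(\frac{1}{2}\mu + \frac{1}{2}\nu + m + \frac{3}{2}\right)}. $$

We can use this to check that (3.35) is correct. Note that we need to use (3.6).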
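
A numerical version of this check is sketched below. It compares the coefficients generated by the recurrence $a_{2n} = -a_{2n-2}/\{(2n+1)^2 - \nu^2\}$ with the closed form in (3.35), writing the Pochhammer symbol as $(q)_k = \Gamma(q + k)/\Gamma(q)$. The value of nu, the number of terms and the helper poch are our own illustrative choices.

```matlab
% Compare the recurrence for a_{2n} with the closed form used in (3.35).
nu = 0.3;  N = 6;                          % illustrative order and term count
a  = zeros(1, N+1);
a(1) = 1/(1 - nu^2);                       % a_0
for n = 1:N
    a(n+1) = -a(n)/((2*n + 1)^2 - nu^2);   % a_{2n} from a_{2n-2}
end
poch   = @(q, k) gamma(q + k)./gamma(q);   % Pochhammer symbol (q)_k
n      = 0:N;
closed = (-1).^n ./ (2.^(2*n + 2) .* poch(0.5 + nu/2, n + 1) .* poch(0.5 - nu/2, n + 1));
err    = max(abs(a - closed)./abs(closed));
disp(err)
```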


3.8 Solutions Expressible as Bessel Functions 

There are many other differential equations whose solutions can be written in terms
of Bessel functions. In order to determine some of these, we consider the transformation

$$ y(x) = x^{\alpha} \tilde{y}(x^{\beta}). $$

Since $\alpha$ and $\beta$ could be fractional, we will restrict our attention to $x \geq 0$. We
substitute this expression into the differential equation and seek values of $\alpha$ and $\beta$
which give Bessel’s equation for $\tilde{y}$.


Example

Let’s try to express the solutions of the differential equation

$$ \frac{d^2 y}{dx^2} - x y = 0 $$

in terms of Bessel functions. This is called Airy’s equation and has solutions
$\mathrm{Ai}(x)$ and $\mathrm{Bi}(x)$, the Airy functions. We start by introducing the function $\tilde{y}$.
Differentiating with respect to $x$ we have

$$ \frac{dy}{dx} = \alpha x^{\alpha-1} \tilde{y} + \beta x^{\alpha+\beta-1} \tilde{y}'. $$

Differentiating again we obtain

$$ \frac{d^2 y}{dx^2} = \alpha(\alpha - 1) x^{\alpha-2} \tilde{y} + (2\alpha\beta + \beta^2 - \beta) x^{\alpha+\beta-2} \tilde{y}' + \beta^2 x^{\alpha+2\beta-2} \tilde{y}''. $$

These expressions can now be substituted into Airy’s equation to give

$$ \alpha(\alpha - 1) x^{\alpha-2} \tilde{y} + (2\alpha\beta + \beta^2 - \beta) x^{\alpha+\beta-2} \tilde{y}' + \beta^2 x^{\alpha+2\beta-2} \tilde{y}'' - x^{\alpha+1} \tilde{y} = 0. $$

It is now convenient to multiply the entire equation by $x^{-\alpha+2}/\beta^2$ (this means that
the coefficient of $\tilde{y}''$ is $x^{2\beta}$), which gives

$$ x^{2\beta} \tilde{y}'' + \frac{2\alpha + \beta - 1}{\beta} x^{\beta} \tilde{y}' + \left( -\frac{x^3}{\beta^2} + \frac{\alpha(\alpha - 1)}{\beta^2} \right) \tilde{y} = 0. $$

Considering the coefficient of $\tilde{y}$ we note that we require $x^3 \propto x^{2\beta}$, which gives
$\beta = \frac{3}{2}$. Requiring the coefficient of $\tilde{y}'$ to be $x^{\beta}$ gives us that
$\alpha = \frac{1}{2}$. The equation is now

$$ x^{2\beta} \tilde{y}'' + x^{\beta} \tilde{y}' - \left( \frac{4 x^3}{9} + \frac{1}{9} \right) \tilde{y} = 0, $$

which has solutions $K_{1/3}(2x^{3/2}/3)$ and $I_{1/3}(2x^{3/2}/3)$. The general solution of Airy’s
equation in terms of Bessel’s functions is therefore

$$ y(x) = x^{1/2} \left\{ A K_{1/3}\left(\tfrac{2}{3} x^{3/2}\right) + B I_{1/3}\left(\tfrac{2}{3} x^{3/2}\right) \right\}. $$

In fact, $\mathrm{Ai}(x) = \sqrt{x/3}\, K_{1/3}(2x^{3/2}/3)/\pi$. A graph of this Airy function is shown in
Figure 11.12. The Airy functions Ai and Bi are available in MATLAB through the
function airy.
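
The relation between $\mathrm{Ai}(x)$ and $K_{1/3}$ can be confirmed numerically. The sketch below uses the built-in functions airy and besselk on a range of positive $x$ that we have chosen for illustration.

```matlab
% Compare Ai(x) with sqrt(x/3)*K_{1/3}(2*x^(3/2)/3)/pi for x > 0.
x    = linspace(0.1, 5, 50);               % sample points (our choice)
zeta = 2*x.^(3/2)/3;
lhs  = airy(0, x);                         % airy(0, x) returns Ai(x)
rhs  = sqrt(x/3) .* besselk(1/3, zeta) / pi;
err  = max(abs(lhs - rhs));
disp(err)
```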


3.9 Physical Applications of the Bessel Functions 
3.9.1 Vibrations of an Elastic Membrane 

We will now derive the equation that governs small displacements, $z = z(x, y, t)$,
of an elastic membrane. We start by considering a small element of the membrane with
sides of length $\delta S_x$ in the x-direction and $\delta S_y$ in the y-direction, which makes angles
$\psi_x$, $\psi_x + \delta\psi_x$ and $\psi_y$, $\psi_y + \delta\psi_y$ with the horizontal, as shown in Figure 3.8. Newton’s
second law of motion in the vertical, z-direction gives

$$ \frac{\partial^2 z}{\partial t^2} \rho\, \delta S_x \delta S_y = \delta S_y \{ T \sin(\psi_x + \delta\psi_x) - T \sin\psi_x \} + \delta S_x \{ T \sin(\psi_y + \delta\psi_y) - T \sin\psi_y \}, $$

where $\rho$ is the constant density (mass per unit area) of the membrane and $T$ is the
tension, assumed constant for small vibrations of the membrane. We will eventually
consider the angles $\psi_x$ and $\psi_y$ to be small, but at the outset we will consider the
changes in these angles, $\delta\psi_x$ and $\delta\psi_y$, to be smaller. Accordingly we expand the
trigonometric functions and find that

$$ \frac{\rho}{T} \frac{\partial^2 z}{\partial t^2} = \cos\psi_x \frac{\delta\psi_x}{\delta S_x} + \cos\psi_y \frac{\delta\psi_y}{\delta S_y}, $$

where we have divided through by the area of the element, $\delta S_x \delta S_y$. We now consider
the limit as the size of the element shrinks to zero, and therefore let $\delta S_x$ and $\delta S_y$
tend to zero. Consequently we find that

$$ \frac{\rho}{T} \frac{\partial^2 z}{\partial t^2} = \cos\psi_x \frac{\partial\psi_x}{\partial S_x} + \cos\psi_y \frac{\partial\psi_y}{\partial S_y}. \tag{3.37} $$

Fig. 3.8. A small section of an elastic membrane.

We can now use the definition of the partial derivatives,

$$ \tan\psi_x = \frac{\partial z}{\partial x}, \qquad \tan\psi_y = \frac{\partial z}{\partial y}. $$

By differentiating these expressions with respect to $x$ and $y$ respectively we find
that

$$ \sec^2\psi_x \frac{\partial\psi_x}{\partial x} = \frac{\partial^2 z}{\partial x^2}, \qquad \sec^2\psi_y \frac{\partial\psi_y}{\partial y} = \frac{\partial^2 z}{\partial y^2}. $$

For small slopes, $\cos\psi_x$ and $\sec^2\psi_x$ are both approximately unity (and similarly for
variation in the y-direction). Also, using the formula for arc length we have

$$ dS_x = \sqrt{1 + \left( \frac{\partial z}{\partial x} \right)^2}\, dx \approx dx, $$

when $|\partial z/\partial x| \ll 1$. Similarly $dS_y \approx dy$ when $|\partial z/\partial y| \ll 1$. Consequently

$$ \frac{\partial\psi_x}{\partial S_x} \approx \frac{\partial\psi_x}{\partial x}, \qquad \frac{\partial\psi_y}{\partial S_y} \approx \frac{\partial\psi_y}{\partial y}. $$

Combining this information yields the governing equation for small deflections
$z = z(x, y, t)$ of an elastic membrane,

$$ \frac{\rho}{T} \frac{\partial^2 z}{\partial t^2} = \frac{\partial^2 z}{\partial x^2} + \frac{\partial^2 z}{\partial y^2}, \tag{3.38} $$

the two-dimensional wave equation. We will define appropriate boundary conditions
in due course. At this stage we have not specified the domain of solution.


One-Dimensional Solutions of the Wave Equation 

We will start by considering the solution of this equation in one dimension. If
we look for solutions of the form $z = z(x, t)$, independent of $y$, as illustrated in
Figure 3.9, we need to solve the one-dimensional wave equation,

$$ \frac{\partial^2 z}{\partial t^2} = c^2 \frac{\partial^2 z}{\partial x^2}, \tag{3.39} $$

where $c = \sqrt{T/\rho}$. This equation also governs the propagation of small-amplitude
waves on a stretched string (for further details, see Billingham and King, 2001).

Fig. 3.9. A one-dimensional solution of the two-dimensional wave equation.

The easiest way to solve (3.39) is to define new variables, known as characteristic
variables, $\xi = x - ct$ and $\eta = x + ct$ (see Section 7.1 for an explanation of where
these come from). In terms of these variables, (3.39) becomes

$$ \frac{\partial^2 z}{\partial\xi\, \partial\eta} = 0. $$

Integrating this with respect to $\eta$ gives

$$ \frac{\partial z}{\partial\xi} = F(\xi), $$

and with respect to $\xi$

$$ z = \int^{\xi} F(s)\, ds + g(\eta) = f(\xi) + g(\eta), $$

and hence we have that

$$ z(x, t) = f(x - ct) + g(x + ct). \tag{3.40} $$

This represents the sum of two waves, one, represented by $f(x - ct)$, propagating
from left to right without change of form at speed $c$, and one, represented by
$g(x + ct)$, from right to left at speed $c$. To see this, consider, for example, the solution
$z = f(x - ct)$, and simply note that on the paths $x = ct + \text{constant}$, $f(x - ct)$ is
constant. Similarly, $g(x + ct)$ is constant on the paths $x = -ct + \text{constant}$.

The functions $f$ and $g$ can be determined from the initial displacement and
velocity. If

$$ z(x, 0) = z_0(x), \qquad \frac{\partial z}{\partial t}(x, 0) = u_0(x), $$

then (3.40) with $t = 0$ gives us

$$ f(x) + g(x) = z_0(x). \tag{3.41} $$

If we now differentiate (3.40) with respect to time, we have

$$ \frac{\partial z}{\partial t} = -c f'(x - ct) + c g'(x + ct), $$

and hence when $t = 0$,

$$ -c f'(x) + c g'(x) = u_0(x). $$

This can be integrated to yield

$$ -c f(x) + c g(x) = \int_a^x u_0(s)\, ds, \tag{3.42} $$

where $a$ is an arbitrary constant. Solving the simultaneous equations (3.41) and
(3.42), we find that

$$ f(x) = \frac{1}{2} z_0(x) - \frac{1}{2c} \int_a^x u_0(s)\, ds, \qquad g(x) = \frac{1}{2} z_0(x) + \frac{1}{2c} \int_a^x u_0(s)\, ds. $$

On substituting these into (3.40), we obtain d’Alembert’s solution of the
one-dimensional wave equation,

$$ z(x, t) = \frac{1}{2} \{ z_0(x - ct) + z_0(x + ct) \} + \frac{1}{2c} \int_{x-ct}^{x+ct} u_0(s)\, ds. \tag{3.43} $$

In particular, if $u_0 = 0$, a string released from rest, the solution consists of a
left-travelling and a right-travelling wave, each with the same shape but half the
amplitude of the initial displacement. The solution when $z_0 = 0$ for $|x| > a$ and
$z_0 = 1$ for $|x| < a$, a top hat initial displacement, is shown in Figure 3.10.

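[Figure 3.10: the displacement plotted against $x$ for $t < a/c$, with levels $1/2$ and $1$ and breakpoints at $x = -a - ct$, $-a + ct$, $a - ct$ and $a + ct$.]

Fig. 3.10. D’Alembert’s solution for an initially stationary top hat displacement.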
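
The top hat solution can be evaluated directly from (3.43) with $u_0 = 0$. The sketch below uses illustrative values $a = 1$ and $c = 1$, and the function handles z0 and z are our own names.

```matlab
% d'Alembert's solution (3.43) for a string released from rest (u_0 = 0)
% with a top hat initial displacement of half-width a.
a  = 1;  c = 1;                            % illustrative values
z0 = @(x) double(abs(x) < a);              % 1 for |x| < a, 0 otherwise
z  = @(x, t) 0.5*(z0(x - c*t) + z0(x + c*t));
t  = 0.5*a/c;                              % a time with t < a/c
disp([z(0, t), z(1.2, t), z(3, t)])        % overlap region, single wave, undisturbed
```

For $t < a/c$ the two half-height waves still overlap in $|x| < a - ct$, which is where the displacement remains equal to one.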


Two-Dimensional Solutions of the Wave Equation 

Let’s now consider the solution of the two-dimensional wave equation, (3.38), for
an elastic, disc-shaped membrane fixed to a circular support. In cylindrical polar
coordinates, the two-dimensional wave equation becomes

$$ \frac{1}{c^2} \frac{\partial^2 z}{\partial t^2} = \frac{\partial^2 z}{\partial r^2} + \frac{1}{r} \frac{\partial z}{\partial r} + \frac{1}{r^2} \frac{\partial^2 z}{\partial\theta^2}. \tag{3.44} $$

We will look for solutions in the circular domain $0 \leq r \leq a$ and $0 \leq \theta < 2\pi$. Such
solutions must be periodic in $\theta$ with period $2\pi$. The boundary condition is $z = 0$ at
$r = a$. We seek a separable solution, $z = R(r)\tau(t)\Theta(\theta)$. On substituting this into
(3.44), we find that

$$ \frac{1}{c^2} \frac{\tau''}{\tau} = \frac{r R'' + R'}{r R} + \frac{1}{r^2} \frac{\Theta''}{\Theta} = -\frac{\omega^2}{c^2}, $$

where $-\omega^2/c^2$ is the separation constant. An appropriate solution is $\tau = e^{i\omega t}$, which
represents a time-periodic solution. This is what we would expect for the vibrations
of an elastic membrane, which is, after all, a drum. The angular frequency, $\omega$,
is yet to be determined. We can now write

$$ \frac{r^2 R'' + r R'}{R} + \frac{\omega^2 r^2}{c^2} = -\frac{\Theta''}{\Theta} = n^2, $$

where $n^2$ is the next separation constant. This gives us $\Theta = A \cos n\theta + B \sin n\theta$,
with $n$ a positive integer for $2\pi$-periodicity. Finally, $R(r)$ satisfies

$$ r^2 \frac{d^2 R}{dr^2} + r \frac{dR}{dr} + \left( \frac{\omega^2 r^2}{c^2} - n^2 \right) R = 0. $$

We can simplify this by introducing a new coordinate $s = \lambda r$ where $\lambda = \omega/c$, which
gives

$$ s^2 \frac{d^2 R}{ds^2} + s \frac{dR}{ds} + (s^2 - n^2) R = 0, $$

which is Bessel’s equation with $\nu = n$. Consequently the solutions can be written
as

$$ R(s) = A J_n(s) + B Y_n(s). $$

We need a solution that is bounded at the origin so, since $Y_n(s)$ is unbounded at
$s = 0$, we require $B = 0$. The other boundary condition is that the membrane is
constrained not to move at the edge of the domain, so that $R(s) = 0$ at $s = \lambda a$.
This gives the condition

$$ J_n(\lambda a) = 0, \tag{3.45} $$

which has an infinite number of solutions $\lambda = \lambda_{ni}$. Specifying the value of $\lambda = \omega/c$
prescribes the frequency at which the membrane will oscillate. Consequently the
functional form of the natural modes of oscillation of a circular membrane is

$$ z = J_n(\lambda_{ni} r) (A \cos n\theta + B \sin n\theta) e^{i\omega_i t}, \tag{3.46} $$

where $\omega_i = c\lambda_{ni}$ and the values of $\lambda_{ni}$ are solutions of (3.45). Figure 3.11 shows
a few of these natural modes when $a = 1$, which we created using the MATLAB
function ezmesh (see Section 2.6.1).
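
The values $\lambda_{ni}$ in (3.45) can be found numerically. The following sketch, for $n = 2$ and with $a = c = 1$, brackets the zeros of $J_n$ by scanning a grid for sign changes and then refines each one with fzero. The grid, the number of zeros nz and the variable names are our own choices, and this is not the code used to produce Figure 3.11.

```matlab
% Sketch: first few solutions of J_n(lambda*a) = 0 and the frequencies c*lambda.
a = 1;  c = 1;  n = 2;  nz = 3;            % illustrative values
Jn = @(s) besselj(n, s);
s  = linspace(0.1, 30, 3000);              % grid starts above the zero at s = 0
v  = Jn(s);
k  = find(v(1:end-1).*v(2:end) < 0);       % grid cells where J_n changes sign
lam   = arrayfun(@(j) fzero(Jn, [s(j), s(j+1)]), k(1:nz)) / a;
omega = c*lam;                             % angular frequencies of the modes
disp(omega)
```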

Here we have considered the natural modes of oscillation. We could however
have tackled an initial value problem. Let’s consider an example where the initial
displacement of the membrane is specified to be $z(r, \theta, 0) = G(r, \theta)$ and the membrane
is released from rest, so that $\partial z/\partial t = 0$ when $t = 0$. The fact that the
membrane is released from rest implies that the temporal variation will be even, so
we need only consider a solution proportional to $\cos\omega_i t$. Consequently, using the
linearity of the wave equation to add all the possible solutions of the form (3.46),
the general form of the displacement of the membrane is

$$ z(r, \theta, t) = \sum_{i=0}^{\infty} \sum_{n=0}^{\infty} J_n(\lambda_{ni} r) (A_{ni} \cos n\theta + B_{ni} \sin n\theta) \cos\omega_i t. $$

Fig. 3.11. Six different modes of oscillation of an elastic membrane.

Using the condition that $z(r, \theta, 0) = G(r, \theta)$ we have the equations

$$ G(r, \theta) = \sum_{i=0}^{\infty} \sum_{n=0}^{\infty} J_n(\lambda_{ni} r) (A_{ni} \cos n\theta + B_{ni} \sin n\theta). \tag{3.47} $$

In order to find the coefficients $A_{ni}$ and $B_{ni}$ we need to exploit the orthogonality of
the Bessel functions, given by (3.30), and of the trigonometric functions, using

$$ \int_0^{2\pi} \cos m\theta \cos n\theta\, d\theta = \int_0^{2\pi} \sin m\theta \sin n\theta\, d\theta = \pi \delta_{mn}, $$

for $m$ and $n$ integers. Multiplying (3.47) through by $\cos m\theta$ and integrating from 0
to $2\pi$ we have

$$ \frac{1}{\pi} \int_0^{2\pi} \cos m\theta\, G(r, \theta)\, d\theta = \sum_{i=0}^{\infty} A_{mi} J_m(\lambda_{mi} r), $$

and with $\sin m\theta$ we have

$$ \frac{1}{\pi} \int_0^{2\pi} \sin m\theta\, G(r, \theta)\, d\theta = \sum_{i=0}^{\infty} B_{mi} J_m(\lambda_{mi} r). $$

Now, using (3.32), we have

$$ A_{mj} = \frac{2}{a^2 \pi [J_m'(\lambda_{mj} a)]^2} \int_{r=0}^{a} r J_m(\lambda_{mj} r) \int_0^{2\pi} \cos m\theta\, G(r, \theta)\, d\theta\, dr $$

and

$$ B_{mj} = \frac{2}{a^2 \pi [J_m'(\lambda_{mj} a)]^2} \int_{r=0}^{a} r J_m(\lambda_{mj} r) \int_0^{2\pi} \sin m\theta\, G(r, \theta)\, d\theta\, dr. $$

Note that if the function $G(r, \theta)$ is even in $\theta$, $B_{mj} = 0$, and similarly if it is odd, $A_{mj} = 0$.
The expansion in $\theta$ is an example of a Fourier series expansion, about which we
will have more to say in Chapter 5.


3.9.2 Frequency Modulation (FM) 

We will now discuss an application in which Bessel functions occur within the
description of a modulated wave (for further information, see Dunlop and Smith,
1977). A frequency-modulated signal comprises a carrier with frequency $f_c$ and
a modulating signal with frequency $f_m$. The phase shift of the carrier is related
to the time integral of the modulating wave $F_m(t) = \cos 2\pi f_m t$, so that the actual
signal is

$$ F_c(t) = \cos\left\{ 2\pi f_c t + 2\pi K_2 \int_0^t a \cos(2\pi f_m t')\, dt' \right\}, $$

which we can integrate to obtain

$$ F_c(t) = \cos\{ 2\pi f_c t + \beta \sin(2\pi f_m t) \}, \tag{3.48} $$

where $\beta = K_2 a/f_m$, a constant called the modulation index. We can expand
(3.48) to give

$$ F_c(t) = \cos(2\pi f_c t) \cos\{ \beta \sin(2\pi f_m t) \} - \sin(2\pi f_c t) \sin\{ \beta \sin(2\pi f_m t) \}. \tag{3.49} $$

An example of this signal is shown in Figure 3.12.

Fig. 3.12. An example of frequency modulation, with $\beta = 1.2$, $f_m = 1.2$ and $f_c = 1$.

We would now like to split the signal (3.49) into its various frequency components.
We can do this by exploiting (3.16) and (3.17) to give

$$ F_c(t) = \cos(2\pi f_c t) \left\{ J_0(\beta) + 2 \sum_{n=1}^{\infty} J_{2n}(\beta) \cos 2n(2\pi f_m t) \right\} - \sin(2\pi f_c t) \left\{ 2 \sum_{n=0}^{\infty} J_{2n+1}(\beta) \sin (2n + 1)(2\pi f_m t) \right\}. \tag{3.50} $$

Using simple trigonometry,

$$ 2 \cos(2\pi f_c t) \cos\{ (2n) 2\pi f_m t \} = \cos\{ 2\pi f_c t + (2n) 2\pi f_m t \} + \cos\{ 2\pi f_c t - (2n) 2\pi f_m t \}, $$

and

$$ 2 \sin(2\pi f_c t) \sin\{ (2n + 1) 2\pi f_m t \} = -\cos\{ 2\pi f_c t + (2n + 1) 2\pi f_m t \} + \cos\{ 2\pi f_c t - (2n + 1) 2\pi f_m t \}. $$

Substituting these expressions back into (3.50), we find that $F_c(t)$ can be written
as

$$ F_c(t) = J_0(\beta) \cos(2\pi f_c t) + \sum_{n=1}^{\infty} J_{2n}(\beta) \left[ \cos\{ 2\pi t (f_c + 2n f_m) \} + \cos\{ 2\pi t (f_c - 2n f_m) \} \right] $$
$$ \qquad - \sum_{n=0}^{\infty} J_{2n+1}(\beta) \left[ \cos\{ 2\pi t (f_c - (2n + 1) f_m) \} - \cos\{ 2\pi t (f_c + (2n + 1) f_m) \} \right], $$

the first few terms of which are given by

$$ F_c(t) = J_0(\beta) \cos(2\pi f_c t) - J_1(\beta) \left[ \cos\{ 2\pi (f_c - f_m) t \} - \cos\{ 2\pi (f_c + f_m) t \} \right] $$
$$ \qquad + J_2(\beta) \left[ \cos\{ 2\pi (f_c - 2 f_m) t \} + \cos\{ 2\pi (f_c + 2 f_m) t \} \right] $$
$$ \qquad - J_3(\beta) \left[ \cos\{ 2\pi (f_c - 3 f_m) t \} - \cos\{ 2\pi (f_c + 3 f_m) t \} \right] + \cdots. $$

This means that the main carrier signal has frequency $f_c$ and amplitude $J_0(\beta)$, and
that the other components, known as sidebands, have frequencies $f_c \pm n f_m$ for
$n$ an integer, and amplitudes $J_n(\beta)$. These amplitudes, known as the frequency
spectrum of the signal, are shown in Figure 3.13.

Fig. 3.13. The frequency spectrum of a signal with $\beta = 0.2$, $f_m = 1.2$ and $f_c = 1$.
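
The sideband expansion can be checked against (3.48) directly. The sketch below uses the equivalent compact form $F_c(t) = \sum_{n=-\infty}^{\infty} J_n(\beta) \cos\{2\pi (f_c + n f_m) t\}$, which follows from the expansion above together with $J_{-n}(\beta) = (-1)^n J_n(\beta)$. The truncation N and the time grid are our own choices, and the parameter values are those quoted for Figure 3.12.

```matlab
% Compare the FM signal (3.48) with its truncated sideband expansion.
beta = 1.2;  fm = 1.2;  fc = 1;            % values quoted for Fig. 3.12
t = linspace(0, 2, 401);                   % time grid (our choice)
direct = cos(2*pi*fc*t + beta*sin(2*pi*fm*t));
N = 20;                                    % keep sidebands n = -N, ..., N
n = (-N:N).';                              % column of sideband indices
series = sum(besselj(n, beta) .* cos(2*pi*(fc + n*fm)*t), 1);
err = max(abs(direct - series));
disp(err)
```

Because $J_n(\beta)$ decays rapidly with $n$ for fixed $\beta$, a modest number of sidebands should already reproduce the signal to rounding error.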
3.3 * Show that

(a) $2 J_\nu'(x) = J_{\nu-1}(x) - J_{\nu+1}(x)$,

(b) $2\nu J_\nu(x) = x J_{\nu+1}(x) + x J_{\nu-1}(x)$.

3.4 Determine the series expansions for the modified Bessel functions $I_\nu(x)$ and
$I_{-\nu}(x)$.

3.5 Find the Wronskian of the functions (a) $J_\nu(x)$ and $J_{-\nu}(x)$, (b) $J_\nu(x)$ and
$Y_\nu(x)$ for $\nu$ not an integer.

3.6 Determine the series expansion of

$$ \int^x x^{\mu} J_\nu(x)\, dx. $$

When $\mu = 1$ and $\nu = 0$ show that this is equivalent to $x J_1(x)$.

3.7 Give the solutions, where possible in terms of Bessel functions, of the
differential equations

(b) $x \dfrac{d^2 y}{dx^2} + \dfrac{dy}{dx} + y = 0$,

(c) $x \dfrac{d^2 y}{dx^2} + (x + 1)^2 y = 0$,

(d)

(e) $\dfrac{d^2 y}{dx^2} - \alpha^2 y = 0$,

(f) $\dfrac{d^2 y}{dx^2} + \beta \dfrac{dy}{dx} + \gamma y = 0$,

(g) $(1 - x^2) \dfrac{d^2 y}{dx^2} - 2x \dfrac{dy}{dx} + n(n + 1) y = 0$.

3.8 Using the expression

$$ J_\nu(z) = \frac{1}{2\pi} \int_0^{2\pi} \cos(\nu\theta - z \sin\theta)\, d\theta, $$

show that $J_\nu(0) = 0$ for $\nu$ a nonzero integer and $J_0(0) = 1$.

3.9 Determine the coefficients of the Fourier-Bessel series for the function

$$ f(x) = \begin{cases} 1 & \text{for } 0 < x < 1, \\ -1 & \text{for } 1 \leq x \leq 2, \end{cases} $$

in terms of the Bessel function $J_0(x)$.

3.10 Determine the coefficients of the Fourier-Bessel series for the function
$f(x) = x$ on the interval $0 \leq x \leq 1$ in terms of the Bessel function $J_1(x)$
(and repeat the exercise for $J_2(x)$). Modify the MATLAB code used to
generate Figure 3.7 to check your answers.

3.11 Calculate the Fourier-Bessel expansion for the functions

(a) $f(x) = \begin{cases} x & \text{for } 0 \leq x < 1, \\ 2 - x & \text{for } 1 < x \leq 2, \end{cases}$

(b) $f(x) = \begin{cases} x^2 + 2 & \text{for } 0 < x < 1, \\ 3 & \text{for } 1 < x < 3, \end{cases}$

in terms of the Bessel function $J_1(x)$.

3.12 Construct the general solution of the differential equation

$$ x^2 \frac{d^2 y}{dx^2} + x \frac{dy}{dx} + (x^2 - \nu^2) y = \sin x. $$

3.13 Project This project arises from attempts to model the baking of foodstuffs,
and thereby improve the quality of mass-produced baked foods. A
key element of such modelling is the temperature distribution in the food,
which, as we showed in Chapter 2, is governed by the diffusion equation,

$$ \frac{\partial T}{\partial t} = \nabla \cdot (D \nabla T). $$

The diffusivity, $D$, is a function of the properties of the food.

We will consider some problems associated with the baking of infinitely-long,
cylindrical, axisymmetric foodstuffs under axisymmetric conditions
(a first approximation to, for example, the baking of a loaf of bread). In
this case, the diffusion equation becomes

$$ \frac{\partial T}{\partial t} = \frac{1}{r} \frac{\partial}{\partial r} \left( r D(r) \frac{\partial T}{\partial r} \right). \tag{E3.1} $$

(a) Look for a separable solution, $T(r, t) = f(r) e^{\omega t}$, of (E3.1) with
$D(r) = D_0$, a constant, subject to the boundary conditions $\partial T/\partial r = 0$
at $r = 0$ and $\partial T/\partial r = h(T_a - T)$ at $r = r_0$. Here, $h > 0$ is a heat
transfer coefficient and $T_a$ is the ambient temperature. Determine
the possible values of $\omega$.

(b) If the initial temperature profile within the foodstuff is $T(r, 0) = T_0(r)$,
determine the temperature for $t > 0$.

(c) In many baking problems, there are two distinct layers of food.
When

$$ D(r) = \begin{cases} D_0 & \text{for } 0 \leq r \leq r_1, \\ D_1 & \text{for } r_1 < r < r_0, \end{cases} $$

and the heat flux and the temperature are continuous at $r = r_1$,
determine the effect of changing $r_1/r_0$ and $D_0$ and $D_1$ on the possible
values of $\omega$.

(d) Solve the initial-boundary value problem when the initial temperature
is uniform in each layer, with

$$ T(r, 0) = \begin{cases} T_0 & \text{for } 0 \leq r \leq r_1, \\ T_1 & \text{for } r_1 < r \leq r_0. \end{cases} $$

(e) Find some realistic values of $D_0$ and $h$ for bread dough, either from
your library or the Internet. How does the temperature at the centre
of baking bread vary with time, if this simple model is to be believed?



CHAPTER FOUR 


Boundary Value Problems, Green’s Functions and
Sturm-Liouville Theory


We now turn our attention to boundary value problems for ordinary differential
equations, for which the boundary conditions are specified at two different points,
$x = a$ and $x = b$. The solutions of boundary value problems have some rather
different properties to those of solutions of initial value problems. In order to
ground our discussion in terms of a real physical problem, we will consider the
dynamics of a string fixed at $x = y = 0$ and $x = l$, $y = 0$, and rotating steadily
about the x-axis at constant angular velocity, as shown in Figure 4.1. This is
rather like a skipping rope, but one whose ends are motionless. In order to be able
to formulate a differential equation that captures the dynamics of this string, we
will make several assumptions.

Fig. 4.1. A string rotating about the x-axis, fixed at $x = 0$ and $x = l$.





(i) The tension in the string is large enough that any additional forces intro- 
duced by the bending of the string are negligible in comparison with the 
tension force. 

(ii) The tension force acts along the local tangent to the string, and is of constant 
magnitude, T. 

(iii) The slope of the string, and hence its displacement from the x-axis, is small.

(iv) The effect of gravity is negligible. 

(v) There is no friction, either due to air resistance on the string or to the fixing 
of the string at each of its ends. 

(vi) The thickness of the string is negligible, but it has a constant line density, 
p, a mass per unit length. 


We denote the constant angular velocity of the string about the x-axis by $\omega$,
and the angle that the tangent to the string makes with the x-axis by the function
$\theta(x)$. Working in a frame of reference that rotates with the string, in which the
string is stationary, Newton’s first law shows that the forces that act on the string,
including the centrifugal force, must be in balance, and hence that

$$ T \sin\theta(x + \delta x) - T \sin\theta(x) = -\rho\omega^2 y \sqrt{\delta x^2 + \delta y^2}, $$

as shown in Figure 4.2. If we divide through by $\delta x$ and take the limit $\delta x \to 0$ we
obtain

$$ T \frac{d}{dx} \{ \sin\theta(x) \} + \rho\omega^2 y \sqrt{1 + \left( \frac{dy}{dx} \right)^2} = 0. $$

By definition, $\tan\theta = dy/dx$, so, by elementary trigonometry,

$$ \sin\theta = \frac{dy}{dx} \left\{ 1 + \left( \frac{dy}{dx} \right)^2 \right\}^{-1/2}, $$

and hence

$$ T \frac{d^2 y}{dx^2} + \rho\omega^2 y \left\{ 1 + \left( \frac{dy}{dx} \right)^2 \right\}^2 = 0. \tag{4.1} $$

This equation of motion for the string must be solved subject to $y(0) = y(l) = 0$,
which constitutes a rather nasty nonlinear boundary value problem, with no
elementary solution apart from the equilibrium solution, $y = 0$.† If, however,
we invoke assumption (iii), that $|dy/dx| \ll 1$, the leading order boundary value
problem is

$$ \frac{d^2 y}{dx^2} + \lambda y = 0, \quad \text{subject to } y(0) = y(l) = 0, \tag{4.2} $$

where $\lambda = \rho\omega^2/T$. This is now a linear boundary value problem, which we can
solve by elementary methods.

† This nonlinear boundary value problem can, however, easily be studied in the phase plane (see
Chapter 9).

Fig. 4.2. A small element of the rotating string.

If we look for a solution of the form $y = A e^{mx}$, we obtain $m^2 + \lambda = 0$, and hence
$m = \pm i\lambda^{1/2}$, so that the solution is $y = A e^{i\lambda^{1/2} x} + B e^{-i\lambda^{1/2} x}$. Since $y(0) = 0$,
$B = -A$, and $y(l) = 0$ gives

$$ e^{i\lambda^{1/2} l} - e^{-i\lambda^{1/2} l} = 0. \tag{4.3} $$

At this stage, we do not know whether $\lambda$ is real, although, since it represents a
ratio of real quantities, it should be. If we write $\lambda^{1/2} = \alpha + i\beta$ and equate real and
imaginary parts in (4.3), we find that

$$ (e^{-\beta l} - e^{\beta l}) \cos\alpha l = 0, \qquad (e^{-\beta l} + e^{\beta l}) \sin\alpha l = 0. \tag{4.4} $$

From the first of these, we have either $e^{-\beta l} - e^{\beta l} = 0$ or $\cos\alpha l = 0$, and hence
either $\beta = 0$ or $\alpha l = (n + \frac{1}{2})\pi$ for $n$ an integer. The latter leaves the second of
equations (4.4) with no solution, so we must have $\beta = 0$, and hence $\sin\alpha l = 0$. This
gives $\alpha l = n\pi$ with $n$ an integer, which means that $\lambda^{1/2}$ is real, as expected, with
$\lambda^{1/2} = n\pi/l$ and $y = A_n (e^{in\pi x/l} - e^{-in\pi x/l}) = 2i A_n \sin(n\pi x/l)$. To ensure that $y$,
the displacement of the string, is real, we write $A_n = a_n/2i$ for $a_n$ real, which gives

$$ y = a_n \sin\left( \frac{n\pi x}{l} \right) \quad \text{for } n = 1, 2, \ldots. \tag{4.5} $$

Note that the solution that corresponds to $n = 0$ is the trivial solution, $y = 0$.

The values of $\lambda$ for which there is a nontrivial solution of this problem, namely
$\lambda = n^2\pi^2/l^2$, are called the eigenvalues of the boundary value problem, whilst the
corresponding solutions, $y = \sin\lambda^{1/2} x$, are the eigenfunctions. Note that there is
an infinite sequence of eigenvalues, which are real and have magnitudes that tend
to infinity as $n$ tends to infinity. In terms of the physical problem, for a string of
given line density $\rho$, length $l$ and tension $T$, a nonequilibrium steady motion is only
possible at a discrete set of angular velocities, the eigenfrequencies,

$$ \omega = \frac{n\pi}{l} \sqrt{\frac{T}{\rho}}. $$
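
As a small numerical illustration, the eigenvalues and eigenfrequencies can be tabulated for a particular string. The values of T, rho and l below are purely illustrative, not taken from the text.

```matlab
% Eigenvalues lambda = (n*pi/l)^2 and eigenfrequencies omega = sqrt(lambda*T/rho).
T = 10;  rho = 0.1;  l = 1;                % illustrative tension, density, length
n = 1:5;
lambda = (n*pi/l).^2;                      % eigenvalues of (4.2)
omega  = sqrt(lambda*T/rho);               % from lambda = rho*omega^2/T
bc     = sin(sqrt(lambda)*l);              % eigenfunctions at x = l, should vanish
disp(omega)
```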


4.1 Inhomogeneous Linear Boundary Value Problems 

Continuing with our example of a rotating string, let’s consider what happens
when there is a steady, imposed external force $T F(x)$ acting towards the x-axis.
The linearized boundary value problem is then

$$ \frac{d^2 y}{dx^2} + \lambda y = F(x), \quad \text{subject to } y(0) = y(l) = 0. \tag{4.6} $$

We can solve this using the variation of parameters formula, (1.6), which gives

$$ y(x) = A \cos\lambda^{1/2} x + B \sin\lambda^{1/2} x + \frac{1}{\lambda^{1/2}} \int_0^x F(s) \sin\lambda^{1/2}(x - s)\, ds. $$

To satisfy the boundary condition $y(0) = 0$ we need $A = 0$, whilst $y(l) = 0$ gives

$$ B \sin\lambda^{1/2} l + \frac{1}{\lambda^{1/2}} \int_0^l F(s) \sin\lambda^{1/2}(l - s)\, ds = 0. \tag{4.7} $$

When $\lambda$ is not an eigenvalue, $\sin\lambda^{1/2} l \neq 0$, and (4.7) has a unique solution

$$ B = -\frac{1}{\lambda^{1/2} \sin\lambda^{1/2} l} \int_0^l F(s) \sin\lambda^{1/2}(l - s)\, ds. $$

However, when $\lambda$ is an eigenvalue, we have $\sin\lambda^{1/2} l = 0$, so that there is a solution
for arbitrary values of $B$, namely

$$ y(x) = B \sin\lambda^{1/2} x + \frac{1}{\lambda^{1/2}} \int_0^x F(s) \sin\lambda^{1/2}(x - s)\, ds, \tag{4.8} $$

provided that

$$ \int_0^l F(s) \sin\lambda^{1/2}(l - s)\, ds = 0. \tag{4.9} $$

If $F(s)$ does not satisfy this integral constraint there is no solution.

These cases, where there may be either no solution or many solutions depending
on the form of $F(x)$, are in complete contrast to the solutions of an initial value
problem, for which, as we shall see in Chapter 8, we can prove theorems that guarantee
the existence and uniqueness of solutions, subject to a few mild conditions
on the form of the ordinary differential equation. This situation also arises for any
system that can be written in the form $Ax = b$, where $A$ is a linear operator. A famous
result (see, for example, Courant and Hilbert, 1937) known as the Fredholm
alternative, shows that either there is a unique solution of $Ax = b$, or $Ax = 0$
has nontrivial solutions.







In terms of the physics of the rotating string problem, if $\omega$ is not an eigenfrequency,
for arbitrary forcing of the string there is a unique steady solution. If $\omega$
is an eigenfrequency and the forcing satisfies (4.9), there is a solution, (4.8), that
is a linear combination of the eigensolution and the response to the forcing. The
size of $B$ depends upon how the steady state was reached, just as for the unforced
problem. If $\omega$ is an eigenfrequency and the forcing does not satisfy (4.9) there is no
steady solution. This reflects the fact that the forcing has a component that drives
a response at the eigenfrequency. We say that there is a resonance. In practice, if
the string were forced in this way, the amplitude of the motion would grow linearly,
whilst varying spatially like $\sin\lambda^{1/2} x$, until the nonlinear terms in (4.1) were no
longer negligible.
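
The constraint (4.9) is easy to test numerically for a given forcing. The sketch below takes $l = 1$ and the first eigenvalue $\lambda = \pi^2$, and evaluates the integral for two forcings of our own choosing: $F = \sin 2\pi x$, for which a steady solution exists, and $F = \sin\pi x$, which is resonant.

```matlab
% Evaluate the integral in (4.9) for lambda = pi^2, l = 1 and two forcings.
l = 1;  lambda = pi^2;                     % first eigenvalue (n = 1)
constraint = @(F) integral(@(s) F(s).*sin(sqrt(lambda)*(l - s)), 0, l);
I_ok  = constraint(@(s) sin(2*pi*s));      % vanishes: (4.9) is satisfied
I_bad = constraint(@(s) sin(pi*s));        % equals 1/2: no steady solution
disp([I_ok, I_bad])
```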


4.1.1 Solubility 

As we have now seen, an inhomogeneous, linear boundary value problem may
have no solutions. Let’s examine this further for the general boundary value problem

$$ (p(x) y'(x))' + q(x) y(x) = f(x) \quad \text{subject to } y(a) = y(b) = 0. \tag{4.10} $$

If $u(x)$ is a solution of the homogeneous problem, so that $(p(x) u'(x))' + q(x) u(x) = 0$
and $u(a) = u(b) = 0$, then multiplying (4.10) by $u(x)$ and integrating over the
interval $[a, b]$ gives

$$ \int_a^b u(x) \{ (p(x) y'(x))' + q(x) y(x) \}\, dx = \int_a^b u(x) f(x)\, dx. \tag{4.11} $$

Now, using integration by parts,

$$ \int_a^b u(x) (p(x) y'(x))'\, dx = \left[ u(x) p(x) y'(x) \right]_a^b - \int_a^b p(x) y'(x) u'(x)\, dx = -\int_a^b p(x) y'(x) u'(x)\, dx, $$

using $u(a) = u(b) = 0$. Integrating by parts again gives

$$ -\int_a^b p(x) y'(x) u'(x)\, dx = -\left[ p(x) u'(x) y(x) \right]_a^b + \int_a^b (p(x) u'(x))' y(x)\, dx = \int_a^b (p(x) u'(x))' y(x)\, dx, $$

using $y(a) = y(b) = 0$. Substituting this into (4.11) gives

$$ \int_a^b y(x) \{ (p(x) u'(x))' + q(x) u(x) \}\, dx = \int_a^b u(x) f(x)\, dx, $$

and, since $u(x)$ is a solution of the homogeneous problem,

$$ \int_a^b u(x) f(x)\, dx = 0. \tag{4.12} $$

A necessary condition for there to be a solution of the inhomogeneous boundary
value problem (4.10) is therefore (4.12), which we call a solvability or solubility
condition. We say that the forcing term, $f(x)$, must be orthogonal to the solution
of the homogeneous problem, a terminology that we will explore in more detail in
Section 4.2.3. Note that variations in the form of the boundary conditions will give
some variation in the form of the solvability condition.
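
As a check on consistency, the solvability condition (4.12) should reduce to the constraint (4.9) for the rotating string. The short working below assumes $p = 1$, $q = \lambda = n^2\pi^2/l^2$, $a = 0$, $b = l$, $f = F$ and the eigenfunction $u(x) = \sin(n\pi x/l)$.

```latex
\begin{align*}
\sin\lambda^{1/2}(l - s)
  &= \sin\left(n\pi - \frac{n\pi s}{l}\right)
   = \sin n\pi \cos\frac{n\pi s}{l} - \cos n\pi \sin\frac{n\pi s}{l}
   = (-1)^{n+1} \sin\frac{n\pi s}{l}, \\
\int_0^l F(s) \sin\lambda^{1/2}(l - s)\, ds
  &= (-1)^{n+1} \int_0^l F(s) \sin\frac{n\pi s}{l}\, ds
   = (-1)^{n+1} \int_0^l u(s) F(s)\, ds .
\end{align*}
```

The two conditions therefore differ only by a factor of $\pm 1$, so that (4.9) holds exactly when $F$ is orthogonal to the eigenfunction.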


4.1.2 The Green’s Function 

Let’s now solve (4.10) using the variation of parameters formula, (1.6). This 
gives 

nx f(s ) 

y(x) = Au 1 (x) + Bu 2 (x) + —A-{u 1 (s)u 2 (x)-u 1 (x)u 2 (s)}ds, (4.13) 

Js=a Wls) 

where u±(x) and u 2 (x) are solutions of the homogeneous problem and W(x) = 
ui(x)u 2 (x)—u[(x)u 2 (x) is the Wronskian of the homogeneous equation. The bound- 
ary conditions show that 

Au\{a) + Bu 2 (a) = 0, 


Aui(b) + Bu 2 (b ) = J {ui(6)u 2 (s) - ui(s)u 2 (b)} ds. 

Provided that ui(a)u 2 (b) ^ ui(b)u 2 (a), we can solve these simultaneous equations 
and substitute back into (4.13) to obtain 

rb f(s) {ui(b)u 2 (s) - ui(s)u 2 (b)} {ui(a)u 2 (x) - ui(x)u 2 (a)} 


v(x) = [ 

J a 


W(s) 


ui(a)u 2 (b) — ui(b)u 2 (a) 


ds 


+ [ {ui(s)u 2 (x) - ui(x)u 2 (s)} ds. (4-14) 

Ja W(s) 

This form of solution, although correct, is not the most convenient one to use.
To improve it, we note that the functions $v_1(x) = u_1(a)u_2(x) - u_1(x)u_2(a)$ and
$v_2(x) = u_1(b)u_2(x) - u_1(x)u_2(b)$, which appear in (4.14) as a product, are linear
combinations of solutions of the homogeneous problem, and are therefore themselves
solutions of the homogeneous problem. They also satisfy $v_1(a) = v_2(b) = 0$.
Because of the way that they appear in (4.14), it makes sense to look for a solution
of the inhomogeneous boundary value problem in the form

$$y(x) = \int_a^b f(s)\, G(x, s)\, ds, \tag{4.15}$$

where

$$G(x, s) = \begin{cases} v_1(s)v_2(x) = G_<(x, s) & \text{for } a \leq s < x, \\ v_1(x)v_2(s) = G_>(x, s) & \text{for } x < s \leq b. \end{cases}$$

The function $G(x, s)$ is known as the Green’s function for the boundary value
problem. From the definition, it is clear that $G$ is continuous at $s = x$, and that
$y(a) = y(b) = 0$, since, for example, $G_>(a, s) = 0$ and $y(a) = \int_a^b G_>(a, s) f(s)\, ds = 0$.
If we calculate the partial derivative with respect to $x$, we find that

$$G_x(x, s = x^-) - G_x(x, s = x^+) = v_1(x)v_2'(x) - v_1'(x)v_2(x) = W = \frac{C}{p(x)},$$

where, using Abel’s formula, $C$ is a constant.

For this to be useful, we must now show that $y(x)$, as defined by (4.15), actually
satisfies the inhomogeneous differential equation, and also determine the value of
$C$. To do this, we split the range of integration and write

$$y(x) = \int_a^x G_<(x, s) f(s)\, ds + \int_x^b G_>(x, s) f(s)\, ds.$$

If we now differentiate under the integral sign, we obtain

$$y'(x) = \int_a^x G_{<,x}(x, s) f(s)\, ds + G_<(x, x) f(x) + \int_x^b G_{>,x}(x, s) f(s)\, ds - G_>(x, x) f(x),$$

where $G_{<,x} = \partial G_< / \partial x$. This simplifies, by virtue of the continuity of the Green’s
function at $x = s$, to give

$$y'(x) = \int_a^x G_{<,x}(x, s) f(s)\, ds + \int_x^b G_{>,x}(x, s) f(s)\, ds = \int_a^x v_1(s) v_{2x}(x) f(s)\, ds + \int_x^b v_{1x}(x) v_2(s) f(s)\, ds,$$

and hence

$$(py')' = \int_a^x v_1(s) (p v_{2x})_x f(s)\, ds + p(x) v_1(x) v_{2x}(x) f(x) + \int_x^b (p v_{1x})_x v_2(s) f(s)\, ds - p(x) v_{1x}(x) v_2(x) f(x)$$

$$= \int_a^x v_1(s) (p v_{2x})_x f(s)\, ds + \int_x^b (p v_{1x})_x v_2(s) f(s)\, ds + C f(x),$$

using the definition of $C$. If we substitute this into the differential equation (4.10),
$(py')' + qy = f$, we obtain

$$\int_a^x v_1(s) (p v_{2x})_x f(s)\, ds + \int_x^b (p v_{1x})_x v_2(s) f(s)\, ds + C f(x) + q(x) \left\{ \int_a^x v_1(s) v_2(x) f(s)\, ds + \int_x^b v_1(x) v_2(s) f(s)\, ds \right\} = f(x).$$

Since $v_1$ and $v_2$ are solutions of the homogeneous problem, the integral terms vanish
and, if we choose $C = 1$, our representation provides us with a solution of the
differential equation.
As an example, consider the boundary value problem

$$y''(x) - y(x) = f(x) \quad \text{subject to} \quad y(0) = y(1) = 0.$$

The solutions of the homogeneous problem are $e^{-x}$ and $e^{x}$. Appropriate combinations
of these that satisfy $v_1(0) = v_2(1) = 0$ are $v_1(x) = A \sinh x$ and
$v_2(x) = B \sinh(1 - x)$, which gives the Green’s function

$$G(x, s) = \begin{cases} AB \sinh(1 - x) \sinh s & \text{for } 0 \leq s < x, \\ AB \sinh(1 - s) \sinh x & \text{for } x < s \leq 1, \end{cases}$$

which is continuous at $s = x$. In addition,

$$G_x(x, s = x^-) - G_x(x, s = x^+) = -AB \cosh(1 - x) \sinh x - AB \sinh(1 - x) \cosh x = -AB \sinh 1.$$

Since $p(x) = 1$, we require $AB = -1/\sinh 1$, and the final Green’s function is

$$G(x, s) = \begin{cases} -\dfrac{\sinh(1 - x) \sinh s}{\sinh 1} & \text{for } 0 \leq s < x, \\[2ex] -\dfrac{\sinh(1 - s) \sinh x}{\sinh 1} & \text{for } x < s \leq 1. \end{cases}$$

The solution of the inhomogeneous boundary value problem can therefore be written
as

$$y(x) = -\int_0^x f(s) \frac{\sinh(1 - x) \sinh s}{\sinh 1}\, ds - \int_x^1 f(s) \frac{\sinh(1 - s) \sinh x}{\sinh 1}\, ds.$$


We will return to the subject of Green’s functions in the next chapter. 
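
The Green's function representation for the example $y'' - y = f$, $y(0) = y(1) = 0$ can be checked numerically. The following is a minimal sketch, not part of the original text: the function names (`green`, `simpson`, `solve`) are hypothetical, the quadrature is a plain composite Simpson rule, and the closed form used for comparison, $y = -1 + \cosh(x - \tfrac12)/\cosh\tfrac12$ for the forcing $f = 1$, is worked out here by direct integration rather than taken from the book.

```python
import math

def green(x, s):
    """Green's function for y'' - y = f with y(0) = y(1) = 0 (example above)."""
    lo, hi = (s, x) if s < x else (x, s)
    # -sinh(1 - x) sinh(s) / sinh(1) for s < x, and the same with x, s swapped for s > x
    return -math.sinh(1.0 - hi) * math.sinh(lo) / math.sinh(1.0)

def simpson(g, a, b, n=200):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3.0

def solve(f, x):
    """Evaluate y(x) = int_0^1 G(x, s) f(s) ds, splitting at the kink s = x."""
    integrand = lambda s: green(x, s) * f(s)
    return simpson(integrand, 0.0, x) + simpson(integrand, x, 1.0)

def exact_unit_forcing(x):
    """Solution of y'' - y = 1, y(0) = y(1) = 0, found by direct integration."""
    return -1.0 + math.cosh(x - 0.5) / math.cosh(0.5)

if __name__ == "__main__":
    for x in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(x, solve(lambda s: 1.0, x), exact_unit_forcing(x))
```

The integral is split at $s = x$ because $G$ has a discontinuous derivative there; integrating across the kink with a single Simpson rule would lose accuracy.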


4.2 The Solution of Boundary Value Problems by Eigenfunction 
Expansions 

In the previous section, we developed the idea of a Green’s function, with which we 
can solve inhomogeneous boundary value problems for linear, second order ordinary 
differential equations. We will now develop an alternative approach that draws 
heavily upon the ideas of linear algebra (see Appendix 1 for a reminder). Before 
we start, it is useful to be able to work with the simplest possible type of linear 
differential operator. 


4.2.1 Self-Adjoint Operators 

We define a linear differential operator, $L : C^2[a,b] \to C[a,b]$, as being in
self-adjoint form if

$$L = \frac{d}{dx}\left( p(x) \frac{d}{dx} \right) + q(x), \tag{4.16}$$

where $p(x) \in C^1[a,b]$ and is strictly nonzero for all $x \in (a,b)$, and $q(x) \in C[a,b]$.
The reasons for referring to such an operator as self-adjoint will become clear later
in this chapter.

This definition encompasses a wide class of second order differential operators.
For example, if

$$L = a_2(x) \frac{d^2}{dx^2} + a_1(x) \frac{d}{dx} + a_0(x) \tag{4.17}$$

is nonsingular on $[a, b]$, we can write it in self-adjoint form by defining (see
Exercise 4.5)

$$p(x) = \exp\left( \int^x \frac{a_1(t)}{a_2(t)}\, dt \right), \qquad q(x) = \frac{a_0(x)}{a_2(x)} \exp\left( \int^x \frac{a_1(t)}{a_2(t)}\, dt \right). \tag{4.18}$$

Note that $p(x) \neq 0$ for $x \in [a,b]$. By studying inhomogeneous boundary value
problems of the form $Ly = f$, or

$$\frac{d}{dx}\left( p(x) \frac{dy}{dx} \right) + q(x) y = f(x), \tag{4.19}$$

we are therefore considering all second order, nonsingular, linear differential
operators. For example, consider Hermite’s equation,

$$\frac{d^2 y}{dx^2} - 2x \frac{dy}{dx} + \lambda y = 0, \tag{4.20}$$

for $-\infty < x < \infty$. This is not in self-adjoint form, but, if we follow the above
procedure, the self-adjoint form of the equation is

$$\frac{d}{dx}\left( e^{-x^2} \frac{dy}{dx} \right) + \lambda e^{-x^2} y = 0.$$

This can be simplified, and kept in self-adjoint form, by writing $u = e^{-x^2/2} y$, to
obtain

$$\frac{d^2 u}{dx^2} - (x^2 - 1) u = -\lambda u. \tag{4.21}$$


4.2.2 Boundary Conditions 

To complete the definition of a boundary value problem associated with (4.19), 
we need to know the boundary conditions. In general these will be of the form 


$$\begin{aligned} \alpha_1 y(a) + \alpha_2 y(b) + \alpha_3 y'(a) + \alpha_4 y'(b) &= 0, \\ \beta_1 y(a) + \beta_2 y(b) + \beta_3 y'(a) + \beta_4 y'(b) &= 0. \end{aligned} \tag{4.22}$$


Since each of these is dependent on the values of y and y' at each end of [a, b], 
we refer to these as mixed or coupled boundary conditions. It is unnecessarily 
complicated to work with the boundary conditions in this form, and we can start 
to simplify matters by deriving Lagrange’s identity. 
Lemma 4.1 (Lagrange’s identity) If $L$ is the linear differential operator given
by (4.16) on $[a,b]$, and if $y_1, y_2 \in C^2[a,b]$, then

$$y_1 (L y_2) - y_2 (L y_1) = \left[ p \left( y_1 y_2' - y_1' y_2 \right) \right]'. \tag{4.23}$$

Proof From the definition of $L$,

$$y_1 (L y_2) - y_2 (L y_1) = y_1 \left\{ (p y_2')' + q y_2 \right\} - y_2 \left\{ (p y_1')' + q y_1 \right\}$$

$$= y_1 (p y_2')' - y_2 (p y_1')' = y_1 \left( p y_2'' + p' y_2' \right) - y_2 \left( p y_1'' + p' y_1' \right)$$

$$= p' \left( y_1 y_2' - y_1' y_2 \right) + p \left( y_1 y_2'' - y_1'' y_2 \right) = \left[ p \left( y_1 y_2' - y_1' y_2 \right) \right]'.$$

□


Now recall that the space $C[a, b]$ is a real inner product space with a standard
inner product defined by

$$\langle f, g \rangle = \int_a^b f(x) g(x)\, dx.$$

If we now integrate (4.23) over $[a, b]$ then

$$\langle y_1, L y_2 \rangle - \langle L y_1, y_2 \rangle = \left[ p \left( y_1 y_2' - y_1' y_2 \right) \right]_a^b. \tag{4.24}$$

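This result can be used to motivate the following definitions. The adjoint operator
to $T$, written $\bar{T}$, satisfies $\langle y_1, T y_2 \rangle = \langle \bar{T} y_1, y_2 \rangle$ for all $y_1$ and $y_2$. For example, let’s
see if we can construct the adjoint to the operator

$$\mathcal{D} = \frac{d^2}{dx^2} + \gamma \frac{d}{dx} + \delta,$$

with $\gamma, \delta \in \mathbb{R}$, on the interval $[0, 1]$, when the functions on which $\mathcal{D}$ operates are
zero at $x = 0$ and $x = 1$. After integrating by parts and applying these boundary
conditions, we find that

$$\langle \phi_1, \mathcal{D} \phi_2 \rangle = \int_0^1 \phi_1 \left( \phi_2'' + \gamma \phi_2' + \delta \phi_2 \right) dx$$

$$= \left[ \phi_1 \phi_2' \right]_0^1 - \int_0^1 \phi_1' \phi_2'\, dx + \left[ \gamma \phi_1 \phi_2 \right]_0^1 - \int_0^1 \gamma \phi_1' \phi_2\, dx + \int_0^1 \delta \phi_1 \phi_2\, dx$$

$$= -\left[ \phi_1' \phi_2 \right]_0^1 + \int_0^1 \phi_1'' \phi_2\, dx - \int_0^1 \gamma \phi_1' \phi_2\, dx + \int_0^1 \delta \phi_1 \phi_2\, dx = \langle \bar{\mathcal{D}} \phi_1, \phi_2 \rangle,$$

where

$$\bar{\mathcal{D}} = \frac{d^2}{dx^2} - \gamma \frac{d}{dx} + \delta.$$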
A linear operator is said to be Hermitian, or self-adjoint, if $\langle y_1, T y_2 \rangle =
\langle T y_1, y_2 \rangle$ for all $y_1$ and $y_2$. It is clear from (4.24) that $L$ is a Hermitian, or
self-adjoint, operator if and only if

$$\left[ p \left( y_1 y_2' - y_1' y_2 \right) \right]_a^b = 0,$$

and hence

$$p(b) \left\{ y_1(b) y_2'(b) - y_1'(b) y_2(b) \right\} - p(a) \left\{ y_1(a) y_2'(a) - y_1'(a) y_2(a) \right\} = 0. \tag{4.25}$$

In other words, whether or not L is Hermitian depends only upon the boundary 
values of the functions in the space upon which it operates. 

There are three different ways in which (4.25) can occur. 

(i) $p(a) = p(b) = 0$. Note that this doesn’t violate our definition of $p$ as strictly
nonzero on the open interval $(a, b)$. This is the case of singular boundary
conditions.

(ii) $p(a) = p(b) \neq 0$, $y_i(a) = y_i(b)$ and $y_i'(a) = y_i'(b)$. This is the case of periodic
boundary conditions.

(iii) $\alpha_1 y_i(a) + \alpha_2 y_i'(a) = 0$ and $\beta_1 y_i(b) + \beta_2 y_i'(b) = 0$, with at least one of the $\alpha_i$
and one of the $\beta_i$ nonzero. These conditions then have nontrivial solutions
if and only if

$$y_1(a) y_2'(a) - y_1'(a) y_2(a) = 0, \qquad y_1(b) y_2'(b) - y_1'(b) y_2(b) = 0,$$

and hence (4.25) is satisfied.

Conditions (iii), each of which involves y and y' at a single endpoint, are called 
unmixed or separated. We have therefore shown that our linear differential 
operator is Hermitian with respect to a pair of unmixed boundary conditions. The 
significance of this result becomes apparent when we examine the eigenvalues and 
eigenfunctions of Hermitian linear operators. 

As an example of how such boundary conditions arise when we model physical 
systems, consider a string that is rotating (as in the example at the start of this 
chapter) or vibrating with its ends fixed. This leads to boundary conditions $y(0) =
y(a) = 0$, which are separated boundary conditions. In the study of the motion of electrons
in a crystal lattice, the periodic conditions p(0) = p(l), y(0) = y(l) are frequently 
used to represent the repeating structure of the lattice. 


4.2.3 Eigenvalues and Eigenfunctions of Hermitian Linear 
Operators 

The eigenvalues and eigenfunctions of a Hermitian, linear operator $L$ are the
nontrivial solutions of $Ly = \lambda y$ subject to appropriate boundary conditions.

Theorem 4.1 Eigenfunctions belonging to distinct eigenvalues of a Hermitian linear
operator are orthogonal.

Proof Let $y_1$ and $y_2$ be eigenfunctions that correspond to the distinct eigenvalues
$\lambda_1$ and $\lambda_2$. Then

$$\langle L y_1, y_2 \rangle = \langle \lambda_1 y_1, y_2 \rangle = \lambda_1 \langle y_1, y_2 \rangle,$$

and

$$\langle y_1, L y_2 \rangle = \langle y_1, \lambda_2 y_2 \rangle = \lambda_2 \langle y_1, y_2 \rangle,$$

so that the Hermitian property $\langle L y_1, y_2 \rangle = \langle y_1, L y_2 \rangle$ gives

$$(\lambda_1 - \lambda_2) \langle y_1, y_2 \rangle = 0.$$

Since $\lambda_1 \neq \lambda_2$, $\langle y_1, y_2 \rangle = 0$, and $y_1$ and $y_2$ are orthogonal. □

As we shall see in the next section, all of the eigenvalues of a Hermitian linear 
operator are real, a result that we will prove once we have defined the notion of a 
complex inner product. 

If the space of functions $C^2[a, b]$ were of finite dimension, we would now argue
that the orthogonal eigenfunctions generated by a Hermitian operator are linearly
independent and can be used as a basis (or in the case of repeated eigenvalues,
extended into a basis). Unfortunately, $C^2[a,b]$ is not finite dimensional, and we
cannot use this argument. We will have to content ourselves with presenting a 
credible method for solving inhomogeneous boundary value problems based upon 
the ideas we have developed, and simply state a theorem that guarantees that the 
method will work in certain circumstances. 


4.2.4 Eigenfunction Expansions 

In order to solve the inhomogeneous boundary value problem given by (4.19) with
$f \in C[a, b]$ and unmixed boundary conditions, we begin by finding the eigenvalues
and eigenfunctions of $L$. We denote these eigenvalues by $\lambda_1, \lambda_2, \ldots, \lambda_n, \ldots$, and
the eigenfunctions by $\phi_1(x), \phi_2(x), \ldots, \phi_n(x), \ldots$. Next, we expand $f(x)$ in terms
of these eigenfunctions, as

$$f(x) = \sum_{n=1}^{\infty} c_n \phi_n(x). \tag{4.26}$$

By making use of the orthogonality of the eigenfunctions, after taking the inner
product of (4.26) with $\phi_n$, we find that the expansion coefficients are

$$c_n = \frac{\langle f, \phi_n \rangle}{\langle \phi_n, \phi_n \rangle}. \tag{4.27}$$

Next, we expand the solution of the boundary value problem in terms of the
eigenfunctions, as

$$y(x) = \sum_{n=1}^{\infty} d_n \phi_n(x), \tag{4.28}$$

and substitute (4.27) and (4.28) into (4.19) to obtain

$$L \left[ \sum_{n=1}^{\infty} d_n \phi_n(x) \right] = \sum_{n=1}^{\infty} c_n \phi_n(x).$$

From the linearity of $L$ and the definition of $\phi_n$ this becomes

$$\sum_{n=1}^{\infty} d_n \lambda_n \phi_n(x) = \sum_{n=1}^{\infty} c_n \phi_n(x).$$

We have therefore constructed a solution of the boundary value problem with $d_n =
c_n / \lambda_n$, if the series (4.28) converges and defines a function in $C^2[a, b]$. This process
will work correctly and give a unique solution provided that none of the eigenvalues
$\lambda_n$ is zero. When $\lambda_m = 0$, there is no solution if $c_m \neq 0$ and an infinite number of
solutions if $c_m = 0$, as we saw in Section 4.1.


Example 

Consider the boundary value problem

$$-y'' = f(x) \quad \text{subject to} \quad y(0) = y(\pi) = 0. \tag{4.29}$$

In this case, the eigenfunctions are solutions of

$$y'' + \lambda y = 0 \quad \text{subject to} \quad y(0) = y(\pi) = 0,$$

which we already know to be $\lambda_n = n^2$, $\phi_n(x) = \sin nx$. We therefore write

$$f(x) = \sum_{n=1}^{\infty} c_n \sin nx,$$

and the solution of the inhomogeneous problem (4.29) is

$$y(x) = \sum_{n=1}^{\infty} \frac{c_n}{n^2} \sin nx.$$

In the case $f(x) = x$,

$$c_n = \frac{\int_0^{\pi} x \sin nx\, dx}{\int_0^{\pi} \sin^2 nx\, dx} = \frac{2(-1)^{n+1}}{n},$$

so that

$$y(x) = 2 \sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n^3} \sin nx.$$

We will discuss the convergence of this type of series, known as a Fourier series, in 
detail in Chapter 5. 

This example is, of course, rather artificial, and we could have integrated (4.29) 
directly. There are, however, many boundary value problems for which this eigen- 
function expansion method is the only way to proceed analytically, such as the 
example given in Section 3.9.1 on Bessel functions. 
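
As a quick check of the first example, the series $y(x) = 2 \sum (-1)^{n+1} n^{-3} \sin nx$ can be compared with the result of integrating $-y'' = x$, $y(0) = y(\pi) = 0$ directly. The sketch below is illustrative only: the function names are hypothetical, and the closed form $y = (\pi^2 x - x^3)/6$ is obtained here by integrating twice and applying the boundary conditions; it is not quoted from the text.

```python
import math

def series_solution(x, terms=2000):
    """Partial sum of y(x) = 2 * sum_{n>=1} (-1)^(n+1) sin(n x) / n^3."""
    return 2.0 * sum(
        (-1) ** (n + 1) * math.sin(n * x) / n ** 3 for n in range(1, terms + 1)
    )

def direct_solution(x):
    """Solution of -y'' = x with y(0) = y(pi) = 0, found by integrating twice."""
    return (math.pi ** 2 * x - x ** 3) / 6.0

if __name__ == "__main__":
    for x in (0.5, 1.0, math.pi / 2, 2.5):
        print(x, series_solution(x), direct_solution(x))
```

Because the coefficients decay like $1/n^3$, the neglected tail after $N$ terms is bounded by roughly $1/N^2$, so a few thousand terms already agree with the closed form to about six decimal places.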
Example

Consider the inhomogeneous equation

$$(1 - x^2) y'' - 2x y' + 2y = f(x) \quad \text{on } -1 < x < 1, \tag{4.30}$$

with $f \in C[-1, 1]$, subject to the condition that $y$ should be bounded on $[-1, 1]$.
We begin by noting that there is a solubility condition associated with this problem.
If $u(x)$ is a solution of the homogeneous problem, then, after multiplying through
by $u$ and integrating over $[-1, 1]$, we find that

$$\left[ u (1 - x^2) y' \right]_{-1}^{1} - \left[ u' (1 - x^2) y \right]_{-1}^{1} = \int_{-1}^{1} u(x) f(x)\, dx.$$

If $u$ and $y$ are bounded on $[-1, 1]$, the left hand side of this equation vanishes, so
that $\int_{-1}^{1} u(x) f(x)\, dx = 0$. Since the Legendre polynomial, $u = P_1(x) = x$, is the
bounded solution of the homogeneous problem, we have

$$\int_{-1}^{1} P_1(x) f(x)\, dx = 0.$$

Now, to solve the boundary value problem, we first construct the eigenfunction
solutions by solving $Ly = \lambda y$, which is

$$(1 - x^2) y'' - 2x y' + (2 - \lambda) y = 0.$$

The choice $2 - \lambda = n(n + 1)$, with $n$ a positive integer, gives us Legendre’s equation
of integer order, which has bounded solutions $y_n(x) = P_n(x)$. These Legendre
polynomials are orthogonal over $[-1, 1]$ (as we shall show in Theorem 4.4), and
form a basis for $C[-1, 1]$. If we now write

$$f(x) = \sum_{m=0}^{\infty} A_m P_m(x),$$

where $A_1 = 0$ by the solubility condition, and then expand $y(x) = \sum_{m=0}^{\infty} B_m P_m(x)$,
we find that

$$\left\{ 2 - m(m + 1) \right\} B_m = A_m \quad \text{for } m \geq 0.$$

The required solution is therefore

$$y(x) = \frac{1}{2} A_0 + B_1 P_1(x) + \sum_{m=2}^{\infty} \frac{A_m}{2 - m(m + 1)} P_m(x),$$

with $B_1$ an arbitrary constant.

Having seen that this method works, we can now state a theorem that gives the 
method a rigorous foundation. 


Theorem 4.2 If $L$ is a nonsingular, linear differential operator defined on a closed
interval $[a, b]$ and subject to unmixed boundary conditions at both endpoints, then

(i) $L$ has an infinite sequence of real eigenvalues $\lambda_0, \lambda_1, \ldots$, which can be
ordered so that

$$|\lambda_0| < |\lambda_1| < \cdots < |\lambda_n| < \cdots$$

and

$$\lim_{n \to \infty} |\lambda_n| = \infty.$$

(ii) The eigenfunctions that correspond to these eigenvalues form a basis for
$C[a, b]$, and the series expansion relative to this basis of a piecewise continuous
function $y$ with piecewise continuous derivative on $[a, b]$ converges
uniformly to $y$ on any subinterval of $[a, b]$ in which $y$ is continuous.

We will not prove this result here.† Instead, we return to the equation, $Ly = \lambda y$,
which defines the eigenfunctions and eigenvalues. For a self-adjoint, second order,
linear differential operator, this is

$$\frac{d}{dx}\left( p(x) \frac{dy}{dx} \right) + q(x) y = \lambda y, \tag{4.31}$$

which, in its simplest form, is subject to the unmixed boundary conditions

$$\alpha_1 y(a) + \alpha_2 y'(a) = 0, \qquad \beta_1 y(b) + \beta_2 y'(b) = 0, \tag{4.32}$$

with $\alpha_1^2 + \alpha_2^2 > 0$ and $\beta_1^2 + \beta_2^2 > 0$ to avoid a trivial condition. This is an example
of a Sturm-Liouville system, and we will devote the rest of this chapter to a
study of the properties of the solutions of such systems.


† For a proof see Ince (1956).


4.3 Sturm-Liouville Systems

In the first three chapters, we have studied linear second order differential equations.
After examining some solution techniques that are applicable to such equations in
general, we studied the particular cases of Legendre’s equation and Bessel’s equation,
since they frequently arise in models of physical systems in spherical and
cylindrical geometries. We saw that, in each case, we can construct a set of orthogonal
solutions that can be used as the basis for a series expansion of the solution of
the physical problem in question, namely the Fourier-Legendre and Fourier-Bessel
series. In this chapter we will see that Legendre’s and Bessel’s equations are examples
of Sturm-Liouville equations, and that we can deduce many properties of
such equations independent of the functional form of the coefficients.


4.3.1 The Sturm-Liouville Equation

Sturm-Liouville equations are of the form

$$\left( p(x) y'(x) \right)' + q(x) y(x) = -\lambda r(x) y(x), \tag{4.33}$$

which can be written more concisely as

$$S y(x, \lambda) = -\lambda r(x) y(x, \lambda), \tag{4.34}$$

where the differential operator $S$ is defined as

$$S \phi \equiv \frac{d}{dx}\left( p(x) \frac{d\phi}{dx} \right) + q(x) \phi. \tag{4.35}$$

This is a slightly more general equation than (4.31). In (4.33), the number $\lambda$ is
the eigenvalue, whose possible values, which may be complex, are critically dependent
upon the given boundary conditions. It is often more important to know the
properties of $\lambda$ than it is to construct the actual solutions of (4.33).

We seek to solve the Sturm-Liouville equation, (4.33), on an open interval, $(a, b)$,
of the real line. We will also make some assumptions about the behaviour of the
coefficients of (4.33) for $x \in (a, b)$, namely that

(i) $p(x)$, $q(x)$ and $r(x)$ are real-valued and continuous,

(ii) $p(x)$ is differentiable, (4.36)

(iii) $p(x) > 0$ and $r(x) > 0$.

Some Examples of Sturm-Liouville Equations

Perhaps the simplest example of a Sturm-Liouville equation is Fourier’s equation,

$$y''(x, \lambda) = -\lambda y(x, \lambda), \tag{4.37}$$

which has solutions $\cos(x\sqrt{\lambda})$ and $\sin(x\sqrt{\lambda})$. We discussed a physical problem that
leads naturally to Fourier’s equation at the start of this chapter, and we will meet
another at the beginning of Chapter 5.

We can write Legendre’s equation and Bessel’s equation as Sturm-Liouville problems.
Recall that Legendre’s equation is

$$\frac{d^2 y}{dx^2} - \frac{2x}{1 - x^2} \frac{dy}{dx} + \frac{\lambda}{1 - x^2} y = 0,$$

and we are usually interested in solving this for $-1 < x < 1$. This can be written
as

$$\left( (1 - x^2) y' \right)' = -\lambda y.$$

If $\lambda = n(n + 1)$, we showed in Chapter 2 that this has solutions $P_n(x)$ and $Q_n(x)$.
Similarly, Bessel’s equation, which is usually solved for $0 < x < a$, is

$$x^2 y'' + x y' + (\lambda x^2 - \nu^2) y = 0.$$

This can be rearranged into the form

$$(x y')' - \frac{\nu^2}{x} y = -\lambda x y.$$

Again, from the results of Chapter 3, we know that this has solutions of the form
$J_\nu(x\sqrt{\lambda})$ and $Y_\nu(x\sqrt{\lambda})$.

Although the Sturm-Liouville forms of these equations may look more cumbersome
than the original forms, we will see that they are very convenient for the
analysis that follows. This is because of the self-adjoint nature of the differential
operator.


4.3.2 Boundary Conditions 

We begin with a couple of definitions. The endpoint, $x = a$, of the interval $(a, b)$
is a regular endpoint if $a$ is finite and the conditions (4.36) hold on the closed
interval $[a, c]$ for each $c \in (a, b)$. The endpoint $x = a$ is a singular endpoint if
$a = -\infty$ or if $a$ is finite but the conditions (4.36) do not hold on the closed interval
$[a, c]$ for some $c \in (a, b)$. Similar definitions hold for the other endpoint, $x = b$. For
example, Fourier’s equation has regular endpoints if $a$ and $b$ are finite. Legendre’s
equation has regular endpoints if $-1 < a < b < 1$, but singular endpoints if $a = -1$
or $b = 1$, since $p(x) = 1 - x^2 = 0$ when $x = \pm 1$. Bessel’s equation has regular
endpoints for $0 < a < b < \infty$, but singular endpoints if $a = 0$ or $b = \infty$, since
$q(x) = -\nu^2/x$ is unbounded at $x = 0$.

We can now define the types of boundary condition that can be applied to a 
Sturm-Liouville equation. 

(i) On a finite interval, $[a, b]$, with regular endpoints, we prescribe unmixed, or
separated, boundary conditions, of the form

$$\alpha_0 y(a, \lambda) + \alpha_1 y'(a, \lambda) = 0, \qquad \beta_0 y(b, \lambda) + \beta_1 y'(b, \lambda) = 0. \tag{4.38}$$

These boundary conditions are said to be real if the constants $\alpha_0$, $\alpha_1$, $\beta_0$
and $\beta_1$ are real, with $\alpha_0^2 + \alpha_1^2 > 0$ and $\beta_0^2 + \beta_1^2 > 0$.

(ii) On an interval with one or two singular endpoints, the boundary conditions
that arise in models of physical problems are usually boundedness conditions.
In many problems, these are equivalent to Friedrichs boundary
conditions, that for some $c \in (a, b)$ there exists $A \in \mathbb{R}^+$ such that

$$|y(x, \lambda)| \leq A \quad \text{for all } x \in (a, c],$$

and similarly if the other endpoint, $x = b$, is singular, there exists $B \in \mathbb{R}^+$
such that

$$|y(x, \lambda)| \leq B \quad \text{for all } x \in [c, b).$$

We can now define the Sturm-Liouville boundary value problem to be the
Sturm-Liouville equation,

$$\left( p(x) y'(x) \right)' + q(x) y(x) = -\lambda r(x) y(x) \quad \text{for } x \in (a, b),$$

where the coefficient functions satisfy the conditions (4.36), to be solved subject to a
separated boundary condition at each regular endpoint and a Friedrichs boundary
condition at each singular endpoint. Note that this boundary value problem is
homogeneous and therefore always has the trivial solution, $y = 0$. A nontrivial
solution, $y(x, \lambda) \not\equiv 0$, is an eigenfunction, and $\lambda$ is the corresponding eigenvalue.





Some Examples of Sturm-Liouville Boundary Value Problems

Consider Fourier’s equation,

$$y''(x, \lambda) = -\lambda y(x, \lambda) \quad \text{for } x \in (0, 1),$$

subject to the boundary conditions $y(0, \lambda) = y(1, \lambda) = 0$, which are appropriate
since both endpoints are regular. The eigenfunctions of this system are $\sin \sqrt{\lambda_n}\, x$
for $n = 1, 2, \ldots$, with corresponding eigenvalues $\lambda = \lambda_n = n^2 \pi^2$.

Legendre’s equation is

$$\left\{ (1 - x^2) y'(x, \lambda) \right\}' = -\lambda y(x, \lambda) \quad \text{for } x \in (-1, 1).$$

Note that this is singular at both endpoints, since $p(\pm 1) = 0$. We therefore apply
Friedrichs boundary conditions, for example with $c = 0$, in the form

$$|y(x, \lambda)| \leq A \quad \text{for } x \in (-1, 0], \qquad |y(x, \lambda)| \leq B \quad \text{for } x \in [0, 1),$$

for some $A, B \in \mathbb{R}^+$. In Chapter 2 we used the method of Frobenius to construct the
solutions of Legendre’s equation, and we know that the only eigenfunctions bounded
at both the endpoints are the Legendre polynomials, $P_n(x)$ for $n = 0, 1, 2, \ldots$, with
corresponding eigenvalues $\lambda = \lambda_n = n(n + 1)$.

Let’s now consider Bessel’s equation with $\nu = 1$, over the interval $(0, 1)$,

$$(x y')' - \frac{y}{x} = -\lambda x y.$$

Because of the form of $q(x)$, $x = 0$ is a singular endpoint, whilst $x = 1$ is a regular
endpoint. Suitable boundary conditions are therefore

$$|y(x, \lambda)| \leq A \quad \text{for } x \in \left( 0, \tfrac{1}{2} \right], \qquad y(1, \lambda) = 0,$$

for some $A \in \mathbb{R}^+$. In Chapter 3 we constructed the solutions of this equation using
the method of Frobenius. The solution that is bounded at $x = 0$ is $J_1(x\sqrt{\lambda})$. The
eigenvalues are solutions of

$$J_1(\sqrt{\lambda_n}) = 0,$$

which we write as $\lambda = \lambda_1^2, \lambda_2^2, \ldots$, where $J_1(\lambda_n) = 0$.

Finally, let’s examine Bessel’s equation with $\nu = 1$, but now for $x \in (0, \infty)$.
Since both endpoints are now singular, appropriate boundary conditions are

$$|y(x, \lambda)| \leq A \quad \text{for } x \in \left( 0, \tfrac{1}{2} \right], \qquad |y(x, \lambda)| \leq B \quad \text{for } x \in \left[ \tfrac{1}{2}, \infty \right),$$

for some $A, B \in \mathbb{R}^+$. The eigenfunctions are again $J_1(x\sqrt{\lambda})$, but now the eigenvalues
lie on the half-line $[0, \infty)$. In other words, the eigenfunctions exist for all real,
positive $\lambda$. The set of eigenvalues for a Sturm-Liouville system is often called the
spectrum. In the first of the Bessel function examples above, we have a discrete 
spectrum, whereas for the second there is a continuous spectrum. We will 
focus our attention on problems that have a discrete spectrum only. 
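
To make the discrete spectrum of the first Bessel example concrete, the eigenvalue condition $J_1(\sqrt{\lambda}) = 0$ on $(0, 1)$ can be solved numerically. The sketch below is an illustration under stated assumptions: `bessel_j1` sums the standard power series for $J_1$ (adequate only for moderate arguments), `bisect` is a plain bisection routine, and the bracketing intervals are chosen by hand. None of these names come from the text.

```python
import math

def bessel_j1(x, terms=40):
    """J_1(x) from its power series: sum_k (-1)^k (x/2)^(2k+1) / (k! (k+1)!)."""
    total = 0.0
    term = x / 2.0  # k = 0 term
    for k in range(terms):
        total += term
        # ratio of consecutive terms: -(x/2)^2 / ((k+1)(k+2))
        term *= -(x / 2.0) ** 2 / ((k + 1) * (k + 2))
    return total

def bisect(g, lo, hi, tol=1e-12):
    """Locate a root of g in [lo, hi], assuming g changes sign on the interval."""
    g_lo = g(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        g_mid = g(mid)
        if g_lo * g_mid <= 0.0:
            hi = mid
        else:
            lo, g_lo = mid, g_mid
    return 0.5 * (lo + hi)

# Hand-picked brackets around the first two positive zeros of J_1.
zeros = [bisect(bessel_j1, 3.0, 4.5), bisect(bessel_j1, 6.5, 7.5)]
eigenvalues = [z ** 2 for z in zeros]  # lambda = (zero of J_1)^2

if __name__ == "__main__":
    for z, lam in zip(zeros, eigenvalues):
        print("zero of J_1:", z, " eigenvalue:", lam)
```

The eigenvalues obtained this way are isolated points, in contrast with the half-line of admissible values found when the interval is extended to $(0, \infty)$.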
4.3.3 Properties of the Eigenvalues and Eigenfunctions

In order to further study the properties of the eigenfunctions and eigenvalues,
we begin by defining the inner product of two complex-valued functions over an
interval $I$ to be

$$\langle \phi_1(x), \phi_2(x) \rangle = \int_I \phi_1^*(x) \phi_2(x)\, dx,$$

where a superscript asterisk denotes the complex conjugate. This means that the
inner product has the properties

(i) $\langle \phi_1, \phi_2 \rangle = \langle \phi_2, \phi_1 \rangle^*$,

(ii) $\langle a_1 \phi_1, a_2 \phi_2 \rangle = a_1^* a_2 \langle \phi_1, \phi_2 \rangle$,

(iii) $\langle \phi_1, \phi_2 + \phi_3 \rangle = \langle \phi_1, \phi_2 \rangle + \langle \phi_1, \phi_3 \rangle$, $\langle \phi_1 + \phi_2, \phi_3 \rangle = \langle \phi_1, \phi_3 \rangle + \langle \phi_2, \phi_3 \rangle$,

(iv) $\langle \phi, \phi \rangle = \int_I |\phi|^2\, dx \geq 0$, with equality if and only if $\phi(x) \equiv 0$ in $I$.

Note that this reduces to the definition of a real inner product if $\phi_1$ and $\phi_2$ are real.

If $\langle \phi_1, \phi_2 \rangle = 0$ with $\phi_1 \not\equiv 0$ and $\phi_2 \not\equiv 0$, we say that $\phi_1$ and $\phi_2$ are orthogonal.

Let $y_1(x), y_2(x) \in C^2[a, b]$ be twice-differentiable complex-valued functions. By
integrating by parts, it is straightforward to show that (see Lemma 4.1)

$$\langle y_2, S y_1 \rangle - \langle S y_2, y_1 \rangle = \left[ p(x) \left\{ y_1(x) \left( y_2^*(x) \right)' - y_1'(x) y_2^*(x) \right\} \right]_{\alpha}^{\beta}, \tag{4.39}$$

which is known as Green’s formula. The inner products are defined over a
subinterval $[\alpha, \beta] \subset (a, b)$, so that we can take the limits $\alpha \to a^+$ and $\beta \to b^-$ when
the endpoints are singular, and the Sturm-Liouville operator, $S$, is given by (4.35).

Now if $x = a$ is a regular endpoint and the functions $y_1$ and $y_2$ satisfy a separated
boundary condition at $a$, then

$$p(a) \left\{ y_1(a) \left( y_2^*(a) \right)' - y_1'(a) y_2^*(a) \right\} = 0. \tag{4.40}$$

If $a$ is a finite singular endpoint and the functions $y_1$ and $y_2$ satisfy the Friedrichs
boundary condition at $a$,

$$\lim_{x \to a^+} \left[ p(x) \left\{ y_1(x) \left( y_2^*(x) \right)' - y_1'(x) y_2^*(x) \right\} \right] = 0. \tag{4.41}$$

Similar results hold at $x = b$.

We can now derive several results concerning the eigenvalues and eigenfunctions 
of a Sturm-Liouville boundary value problem. 

Theorem 4.3 The eigenvalues of a Sturm-Liouville boundary value problem are 
real. 

Proof If we substitute $y_1(x) = y(x, \lambda)$ and $y_2(x) = y^*(x, \lambda)$ into Green’s formula
over the entire interval, $[a, b]$, we have

$$\langle y^*(x, \lambda), S y(x, \lambda) \rangle - \langle S y^*(x, \lambda), y(x, \lambda) \rangle = \left[ p(x) \left\{ y(x, \lambda) \left( y^*(x, \lambda) \right)' - y'(x, \lambda) y^*(x, \lambda) \right\} \right]_a^b = 0,$$

making use of (4.40) and (4.41). Now, using the fact that the functions $y(x, \lambda)$ and
$y^*(x, \lambda)$ are solutions of (4.33) and its complex conjugate, we find that

$$\int_a^b r(x) y(x, \lambda) y^*(x, \lambda) (\lambda - \lambda^*)\, dx = (\lambda - \lambda^*) \int_a^b r(x) |y(x, \lambda)|^2\, dx = 0.$$

Since $r(x) > 0$ and $y(x, \lambda)$ is nontrivial, we must have $\lambda = \lambda^*$, and hence $\lambda \in \mathbb{R}$.

□


Theorem 4.4 If $y(x, \lambda)$ and $y(x, \tilde{\lambda})$ are eigenfunctions of the Sturm-Liouville
boundary value problem, with $\lambda \neq \tilde{\lambda}$, then these eigenfunctions are orthogonal over
$(a, b)$ with respect to the weighting function $r(x)$, so that

$$\int_a^b r(x) y(x, \lambda) y(x, \tilde{\lambda})\, dx = 0. \tag{4.42}$$

Proof Firstly, notice that the separated boundary condition, (4.38), at $x = a$ takes
the form

$$\alpha_0 y_1(a) + \alpha_1 y_1'(a) = 0, \qquad \alpha_0 y_2(a) + \alpha_1 y_2'(a) = 0. \tag{4.43}$$

Taking the complex conjugate of the second of these gives

$$\alpha_0 y_2^*(a) + \alpha_1 \left( y_2'(a) \right)^* = 0, \tag{4.44}$$

since $\alpha_0$ and $\alpha_1$ are real. For the first of equations (4.43) together with (4.44) to have a
nontrivial solution for $\alpha_0$ and $\alpha_1$, we need

$$y_1(a) \left( y_2'(a) \right)^* - y_1'(a) y_2^*(a) = 0.$$

A similar result holds at the other endpoint, $x = b$. This clearly shows that

$$p(x) \left\{ y(x, \lambda) \left( y'(x, \tilde{\lambda}) \right)^* - y'(x, \lambda) \left( y(x, \tilde{\lambda}) \right)^* \right\} \to 0$$

as $x \to a$ and $x \to b$, so that, from Green’s formula, (4.39),

$$\langle y(x, \tilde{\lambda}), S y(x, \lambda) \rangle = \langle S y(x, \tilde{\lambda}), y(x, \lambda) \rangle.$$

If we evaluate this formula, we find that

$$\int_a^b r(x) y(x, \lambda) y(x, \tilde{\lambda})\, dx = 0,$$

so that the eigenfunctions associated with the distinct eigenvalues $\lambda$ and $\tilde{\lambda}$ are
orthogonal with respect to the weighting function $r(x)$. □


Example 

Consider Hermite’s equation, (4.20). By using the method of Frobenius, we can
show that there are polynomial solutions, $H_n(x)$, when $\lambda = 2n$ for $n = 0, 1, 2, \ldots$.
For example, $H_0(x) = 1$, $H_1(x) = 2x$ and $H_2(x) = 4x^2 - 2$. The solutions of (4.21),
the self-adjoint form of the equation, that are bounded at infinity for $\lambda = 2n$ then
take the form $u_n = e^{-x^2/2} H_n(x)$, and, from Theorem 4.4, satisfy the orthogonality
condition

$$\int_{-\infty}^{\infty} e^{-x^2} H_n(x) H_m(x)\, dx = 0 \quad \text{for } n \neq m.$$
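
The orthogonality condition for the Hermite polynomials can be checked numerically. The following sketch is illustrative and rests on assumptions that are not in the text: the polynomials are generated with the standard three-term recurrence $H_{n+1} = 2x H_n - 2n H_{n-1}$, the infinite range is truncated to $[-10, 10]$ (where $e^{-x^2}$ is negligible), and the integral is approximated with the trapezium rule. The function names are hypothetical.

```python
import math

def hermite(n, x):
    """H_n(x) via the recurrence H_{k+1} = 2x H_k - 2k H_{k-1}, H_0 = 1, H_1 = 2x."""
    h_prev, h = 1.0, 2.0 * x
    if n == 0:
        return h_prev
    for k in range(1, n):
        h_prev, h = h, 2.0 * x * h - 2.0 * k * h_prev
    return h

def weighted_inner(n, m, half_width=10.0, steps=4000):
    """Trapezium-rule estimate of int exp(-x^2) H_n(x) H_m(x) dx over [-10, 10]."""
    h = 2.0 * half_width / steps
    total = 0.0
    for k in range(steps + 1):
        x = -half_width + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * math.exp(-x * x) * hermite(n, x) * hermite(m, x)
    return total * h

if __name__ == "__main__":
    for n in range(4):
        print([round(weighted_inner(n, m), 8) for m in range(4)])
```

The printed matrix is diagonal to within rounding error; the diagonal entries are the squared norms, which for this normalisation are known to equal $2^n n! \sqrt{\pi}$.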


4.3.4 Bessel’s Inequality, Approximation in the Mean and 
Completeness 

We can now define a sequence of orthonormal eigenfunctions

$$\phi_n(x) = \frac{\sqrt{r(x)}\, y(x, \lambda_n)}{\left\langle \sqrt{r(x)}\, y(x, \lambda_n), \sqrt{r(x)}\, y(x, \lambda_n) \right\rangle^{1/2}},$$

which satisfy

$$\langle \phi_n(x), \phi_m(x) \rangle = \delta_{nm}, \tag{4.45}$$

where $\delta_{nm}$ is the Kronecker delta. We will try to establish when we can write a
piecewise continuous function $f(x)$ in the form

$$f(x) = \sum_{i=0}^{\infty} a_i \phi_i(x). \tag{4.46}$$

Taking the inner product of both sides of this series with $\phi_j(x)$ shows that

$$a_j = \langle \phi_j(x), f(x) \rangle, \tag{4.47}$$

using the orthonormality condition, (4.45). The quantities $a_i$ are known as the
expansion coefficients, or generalized Fourier coefficients. In order to motivate
the infinite series expansion (4.46), we start by approximating $f(x)$ by a finite sum,

$$f_N(x) = \sum_{i=0}^{N} A_i \phi_i(x),$$

for some finite $N$, where the $A_i$ are to be determined so that this provides the most
accurate approximation to $f(x)$. The error in this approximation is

$$R_N(x) = f(x) - \sum_{i=0}^{N} A_i \phi_i(x).$$

We now try to minimize this error by minimizing its norm

$$\|R_N\|^2 = \langle R_N(x), R_N(x) \rangle = \int_a^b \left| f(x) - \sum_{i=0}^{N} A_i \phi_i(x) \right|^2 dx,$$

which is the mean square error in the approximation. Now

$$\|R_N\|^2 = \left\langle f(x) - \sum_{i=0}^{N} A_i \phi_i(x),\; f(x) - \sum_{i=0}^{N} A_i \phi_i(x) \right\rangle$$

$$= \|f(x)\|^2 - \left\langle f(x), \sum_{i=0}^{N} A_i \phi_i(x) \right\rangle - \left\langle \sum_{i=0}^{N} A_i \phi_i(x), f(x) \right\rangle + \left\langle \sum_{i=0}^{N} A_i \phi_i(x), \sum_{i=0}^{N} A_i \phi_i(x) \right\rangle.$$

We can now use the orthonormality of the eigenfunctions, (4.45), and the expression
(4.47), which determines the coefficients $a_i$, to obtain

$$\|R_N(x)\|^2 = \|f(x)\|^2 - \sum_{i=0}^{N} A_i \langle f(x), \phi_i(x) \rangle - \sum_{i=0}^{N} A_i^* \langle \phi_i(x), f(x) \rangle + \sum_{i=0}^{N} A_i^* A_i \langle \phi_i(x), \phi_i(x) \rangle$$

$$= \|f(x)\|^2 + \sum_{i=0}^{N} \left\{ -A_i a_i^* - A_i^* a_i + A_i^* A_i \right\}$$

$$= \|f(x)\|^2 + \sum_{i=0}^{N} \left\{ |A_i - a_i|^2 - |a_i|^2 \right\}.$$

The error is therefore smallest when $A_i = a_i$ for $i = 0, 1, \ldots, N$, so the most
accurate approximation is formed by simply truncating the series (4.46) after $N$
terms. In addition, since the norm of $R_N(x)$ is positive,

$$\sum_{i=0}^{N} |a_i|^2 \leq \int_a^b |f(x)|^2\, dx.$$

As the right hand side of this is independent of $N$, it follows that

$$\sum_{i=0}^{\infty} |a_i|^2 \leq \int_a^b |f(x)|^2\, dx, \tag{4.48}$$

which is Bessel’s inequality. This shows that the sum of the squares of the 
expansion coefficients converges. Approximations by the method of least squares 
are often referred to as approximations in the mean, because of the way the error 
is minimized. 
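
Bessel's inequality can be illustrated with a concrete orthonormal system. The sketch below is a hedged example, not taken from the text: it assumes the orthonormal functions $\phi_n(x) = \sqrt{2/\pi} \sin nx$ on $[0, \pi]$ (indexed from $n = 1$), the test function $f(x) = x$, and a composite Simpson rule for the inner products. All names are hypothetical.

```python
import math

def simpson(g, a, b, n=2000):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3.0

def phi(n, x):
    """Orthonormal eigenfunctions of y'' = -lambda y, y(0) = y(pi) = 0."""
    return math.sqrt(2.0 / math.pi) * math.sin(n * x)

def coefficient(f, n):
    """Generalized Fourier coefficient a_n = <phi_n, f> for real f."""
    return simpson(lambda x: phi(n, x) * f(x), 0.0, math.pi)

def partial_sums(f, count):
    """Running sums of a_n^2 for n = 1, ..., count."""
    sums, running = [], 0.0
    for n in range(1, count + 1):
        running += coefficient(f, n) ** 2
        sums.append(running)
    return sums

f = lambda x: x
norm_squared = simpson(lambda x: f(x) ** 2, 0.0, math.pi)  # equals pi^3 / 3
sums = partial_sums(f, 20)

if __name__ == "__main__":
    print("||f||^2 =", norm_squared)
    for n, s in enumerate(sums, start=1):
        print(n, s)
```

The partial sums increase monotonically and stay below $\|f\|^2 = \pi^3/3$. For this particular system the gap closes as more terms are included, which is the equality case discussed for complete systems.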

If, for a given orthonormal system, $\phi_1(x), \phi_2(x), \ldots$, any piecewise continuous
function can be approximated in the mean to any desired degree of accuracy by
choosing $N$ large enough, then the orthonormal system is said to be complete. For
complete orthonormal systems, $R_N(x) \to 0$ as $N \to \infty$, so that Bessel’s inequality
becomes an equality,

$$\sum_{i=0}^{\infty} |a_i|^2 = \int_a^b |f(x)|^2\, dx, \tag{4.49}$$

for every function $f(x)$.

The completeness of orthonormal systems, as expressed by

$$\lim_{N \to \infty} \int_a^b \left| f(x) - \sum_{i=0}^{N} a_i \phi_i(x) \right|^2 dx = 0,$$

does not necessarily imply that $f(x) = \sum_{i=0}^{\infty} a_i \phi_i(x)$, in other words that $f(x)$
has an expansion in terms of the $\phi_i(x)$. If, however, the series $\sum_{i=0}^{\infty} a_i \phi_i(x)$ is
uniformly convergent, then the limit and the integral can be interchanged, the
expansion is valid, and we say that $\sum_{i=0}^{\infty} a_i \phi_i(x)$ converges in the mean to $f(x)$.

The completeness of the system $\phi_1(x), \phi_2(x), \ldots$, should be seen as a necessary
condition for the validity of the expansion, but, for an arbitrary function $f(x)$, the
question of convergence requires a more detailed investigation.

The Legendre polynomials $P_0(x), P_1(x), \ldots$ on the interval $[-1, 1]$ and the Bessel
functions $J_\nu(\lambda_1 x), J_\nu(\lambda_2 x), \ldots$ on the interval $[0, a]$ are both examples of complete
orthogonal systems (they can easily be made orthonormal), and the expansions
of Chapters 2 and 3 are special cases of the more general results of this chapter.
For example, the Bessel functions $J_\nu(\sqrt{\lambda}\, x)$ satisfy the Sturm-Liouville equation,
(4.33), with $p(x) = x$, $q(x) = -\nu^2/x$ and $r(x) = x$. They satisfy the orthogonality
relation

$$\int_0^a x J_\nu(\sqrt{\mu}\, x) J_\nu(\sqrt{\lambda}\, x)\, dx = 0,$$

if $\lambda$ and $\mu$ are distinct eigenvalues. Using the regular endpoint condition $J_\nu(\sqrt{\lambda}\, a) =
0$ and the singular endpoint condition at $x = 0$, the eigenvalues, that is the zeros
of $J_\nu(x)$, can be written as $\sqrt{\lambda}\, a = \lambda_1 a, \lambda_2 a, \ldots$, so that $\sqrt{\lambda} = \lambda_i$ for $i = 1, 2, \ldots$,
and we can write

$$f(x) = \sum_{i=1}^{\infty} a_i J_\nu(\lambda_i x),$$

with

$$a_i = \frac{2}{a^2 \left\{ J_\nu'(\lambda_i a) \right\}^2} \int_0^a x J_\nu(\lambda_i x) f(x)\, dx,$$

consistent with (3.32).


4.3.5 Further Properties of Sturm-Liouville Systems 

We conclude this section by investigating some of the qualitative properties of 
solutions of the Sturm-Liouville system (4.33). In particular, we will establish 
that the $n$th eigenfunction has $n$ zeros in the open interval $(a, b)$. We will take
a geometrical point of view in order to establish this result, although we could
have used an analytical framework. To achieve this, we introduce the Prüfer
substitution,

$$p(x)y'(x) = R(x)\cos\theta(x), \quad y(x) = R(x)\sin\theta(x). \tag{4.50}$$

The new dependent variables, $R$ and $\theta$, are then defined by

$$R^2 = y^2 + p^2(y')^2, \quad \theta = \tan^{-1}\left(\frac{y}{py'}\right) \tag{4.51}$$

and, by analogy with polar coordinates, we call $R$ the amplitude and $\theta$ the phase
angle. Nontrivial eigenfunction solutions of the Sturm-Liouville equation have
$R > 0$, since $R = 0$ at any point $x = x_0$ would mean that $y(x_0) = y'(x_0) = 0$, and
hence give the trivial solution for all $x$.

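If we now write $\cot\theta(x) = py'/y$ and differentiate, we obtain

$$-\operatorname{cosec}^2\theta\,\frac{d\theta}{dx} = \frac{(py')'}{y} - \frac{p(y')^2}{y^2} = -\lambda r - q - \frac{1}{p}\cot^2\theta,$$

and hence

$$\frac{d\theta}{dx} = \left(q(x) + \lambda r(x)\right)\sin^2\theta + \frac{1}{p(x)}\cos^2\theta. \tag{4.52}$$

If we now differentiate (4.51)$_1$, some simple manipulation gives

$$\frac{dR}{dx} = \frac{1}{2}\left\{\frac{1}{p(x)} - q(x) - \lambda r(x)\right\} R\sin 2\theta. \tag{4.53}$$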


The Prüfer substitution has therefore changed our second order linear differential
equation into a system of two first order nonlinear differential equations in $R$ and
$\theta$ over the interval $a < x < b$, (4.52) and (4.53). The equation for $\theta$, (4.52), is
however independent of $R$ and is, as we shall see, relatively easy to analyze.
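As a consistency check on the phase and amplitude equations, consider Fourier’s equation, $y'' + \lambda y = 0$, which is the Sturm-Liouville equation with $p = 1$, $q = 0$ and $r = 1$. This special case, and the choice $\lambda = k^2 > 0$ with $y = \sin kx$, are picked here purely for illustration; the sketch below simply compares the two sides of the phase equation.

```latex
\begin{align*}
% Prüfer equations for p = 1, q = 0, r = 1
\frac{d\theta}{dx} &= \lambda\sin^2\theta + \cos^2\theta, \qquad
\frac{dR}{dx} = \tfrac{1}{2}(1 - \lambda)R\sin 2\theta, \\
% phase angle of y = sin kx, from tan(theta) = y/(p y')
\tan\theta &= \frac{y}{y'} = \frac{\tan kx}{k}
\quad\Longrightarrow\quad
\frac{d\theta}{dx} = \cos^2\theta\,\sec^2 kx, \\
% right hand side of the phase equation with lambda = k^2
\cos^2\theta + k^2\sin^2\theta &= \cos^2\theta\left(1 + k^2\tan^2\theta\right)
 = \cos^2\theta\left(1 + \tan^2 kx\right) = \cos^2\theta\,\sec^2 kx .
\end{align*}
```

The two expressions for $d\theta/dx$ agree. When $\lambda = 1$ the amplitude equation reduces to $dR/dx = 0$, consistent with $R^2 = y^2 + (y')^2 = 1$ for $y = \sin x$.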

If we now consider the separated boundary conditions

$$\alpha_1 y(a) + \alpha_2 y'(a) = 0, \quad \beta_1 y(b) + \beta_2 y'(b) = 0,$$

we can define two phase angles, $\gamma$ and $\delta$, such that

$$\tan\gamma = \frac{y(a)}{p(a)y'(a)} = -\frac{\alpha_2}{p(a)\alpha_1} \quad \text{for } 0 \le \gamma < \pi,$$

$$\tan\delta = \frac{y(b)}{p(b)y'(b)} = -\frac{\beta_2}{p(b)\beta_1} \quad \text{for } 0 < \delta \le \pi,$$

and the eigenvalue problem that we have to solve is (4.52) subject to

$$\theta(a) = \gamma, \quad \theta(b) = \delta + n\pi \quad \text{for } n = 0, 1, 2, \ldots. \tag{4.54}$$

We need to add this multiple of $\pi$ because of the periodicity of the tangent function.

We can infer the qualitative form of the solution of (4.52) by drawing its direction
field. This is the set of small line segments of slope $d\theta/dx$ in the $(\theta, x)$
plane, as sketched in Figure 4.3. Note that $d\theta/dx = 1/p(x)$ at $\theta = n\pi$, which is
independent of $\lambda$. In addition, $d\theta/dx = q(x) + \lambda r(x)$ at $\theta = (n + \frac{1}{2})\pi$, which, for
fixed $x$, increases with increasing $\lambda$. From Figure 4.4, which shows some typical
solution curves for various values of $\lambda$, we can see that, for any initial condition
$\theta(a) = \gamma$, $\theta(b)$ is an increasing function of $\lambda$. As $\lambda$ increases from $-\infty$, there is a
first value, $\lambda = \lambda_0$, for which $\theta(b) = \delta$. As $\lambda$ increases further, there is a sequence
of values for which $\theta(b) = \delta + n\pi$ for $n = 1, 2, \ldots$. Each of these is associated with an
eigenfunction, $y_n(x) = R_n(x)\sin\theta(x, \lambda_n)$. This has a zero when $\sin\theta = 0$ and, in
the interval $\gamma \le \theta \le \delta + n\pi$, there are precisely $n$ zeros at $\theta = \pi, 2\pi, \ldots, n\pi$.

Fig. 4.3. The direction field (lines of slope $d\theta/dx$) for (4.52) when $p(x) = x$, $q(x) = -1/x$
and $r(x) = x$, which corresponds to Bessel’s equation of order one.


Returning now to the example $y'' + \lambda y = 0$ subject to $y(0) = y(\pi) = 0$, we have
eigenvalues $\lambda_n = n^2$ and eigenfunctions $\sin nx$ for $n = 1, 2, \ldots$. Because of the way
we have labelled these, $\lambda_0 = 1$ and $y_0(x) = \sin x$ is the zeroth eigenfunction, which
has no zeros in $0 < x < \pi$. We can also see that $\lambda_1 = 2^2$ and the first eigenfunction,
$y_1(x) = \sin 2x$, has one zero in $0 < x < \pi$, at $x = \pi/2$, and so on. We can formalize
this analysis as a theorem.


Theorem 4.5 A regular Sturm-Liouville system has an infinite sequence of real
eigenvalues, $\lambda_0 < \lambda_1 < \cdots < \lambda_n < \cdots$, with $\lambda_n \to \infty$ as $n \to \infty$. The corresponding
eigenfunctions, $y_n(x)$, have $n$ zeros in the interval $a < x < b$.
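The behaviour described by the theorem can be illustrated numerically. The following is a minimal sketch, not part of the original argument, that integrates the phase equation (4.52) for Fourier’s equation, $y'' + \lambda y = 0$ on $0 < x < \pi$ with $y(0) = 0$, for which $p = 1$, $q = 0$, $r = 1$ and $\theta(0) = 0$. The variable names, the sample values of $\lambda$ and the tolerances are assumptions made for this example; `ode45` is the standard MATLAB Runge-Kutta solver.

```matlab
% Phase equation (4.52) with p = 1, q = 0, r = 1:
%   dtheta/dx = lambda*sin(theta)^2 + cos(theta)^2,  theta(0) = 0.
lambdas  = [0.5 1 2 4 6 9];          % sample values of lambda (assumed)
thetaEnd = zeros(size(lambdas));     % theta(pi) for each lambda
opts = odeset('RelTol',1e-10,'AbsTol',1e-12);
for j = 1:numel(lambdas)
    lam = lambdas(j);
    rhs = @(x,th) lam*sin(th).^2 + cos(th).^2;
    [~,th] = ode45(rhs,[0 pi],0,opts);
    thetaEnd(j) = th(end);
end
% First row: lambda; second row: theta(pi)/pi.
disp([lambdas; thetaEnd/pi])
```

The second row of the output should increase with $\lambda$ and pass through 1, 2 and 3 at $\lambda = 1$, 4 and 9, the first three eigenvalues; between these values the number of multiples of $\pi$ that $\theta$ has crossed inside the interval counts the interior zeros of the corresponding solution.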


We can prove another useful result if we add an additional constraint. If $q(x) < 0$,
then all the eigenvalues are positive. To see why this should be so, consider the
boundary value problem

$$\frac{d}{dx}\left(p(x)\frac{dy}{dx}\right) + \left(\lambda r(x) + q(x)\right)y = 0 \quad \text{subject to } y(a) = y(b) = 0.$$

Fig. 4.4. Typical solutions of (4.52) for various values of $\lambda$ when $p(x) = x$, $q(x) = -1/x$
and $r(x) = x$, which corresponds to Bessel’s equation of order one.

If we multiply through by $y$ and integrate over $[a, b]$, we obtain

$$\lambda\int_a^b r(x)y^2\,dx = \int_a^b -q(x)y^2\,dx - \int_a^b y\frac{d}{dx}\left(p(x)\frac{dy}{dx}\right)dx = \int_a^b \left\{-q(x)y^2 + p(x)(y')^2\right\}dx,$$

using integration by parts, where the boundary terms vanish because $y(a) = y(b) = 0$.
This shows that

$$\lambda = \int_a^b \left\{-q(x)y^2 + p(x)(y')^2\right\}dx \bigg/ \int_a^b r(x)y^2\,dx,$$

which is positive when $p$ and $r$ are positive and $q$ is negative.
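As an illustration of this quotient (a worked check chosen here, not taken from the text), consider the borderline case $p = r = 1$, $q = 0$ on $[0, \pi]$, that is Fourier’s equation with $y(0) = y(\pi) = 0$ and eigenfunctions $y_n = \sin nx$. The quotient is still positive, because $p(y')^2$ is not identically zero, and it reproduces the known eigenvalues.

```latex
\begin{align*}
\lambda
 = \frac{\int_0^\pi \left\{-q\,y_n^2 + p\,(y_n')^2\right\}dx}{\int_0^\pi r\,y_n^2\,dx}
 = \frac{\int_0^\pi n^2\cos^2 nx\,dx}{\int_0^\pi \sin^2 nx\,dx}
 = \frac{n^2\,\pi/2}{\pi/2}
 = n^2 .
\end{align*}
```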


4.3.6 Two Examples from Quantum Mechanics 

One of the areas of mathematical physics where Sturm-Liouville equations arise 
most often is quantum mechanics, the theory that governs the behaviour of mat- 
ter on very small length scales. Three important postulates of quantum mechanics 
are (see Schiff, 1968): 

(i) A system is completely specified by a state function, or wave function,
$\psi(\mathbf{r}, t)$, with $\langle\psi, \psi\rangle = 1$. For example, if the system consists of a particle
moving in an external potential, then $|\psi(\mathbf{r}, t)|^2\,d^3\mathbf{r}$ is the probability of finding
the particle in a small volume $d^3\mathbf{r}$ that surrounds the point $\mathbf{r}$, and hence
we need

$$\int |\psi(\mathbf{r})|^2\,d^3\mathbf{r} = \langle\psi, \psi\rangle = 1.$$

(ii) For every system there exists a certain Hermitian operator, $H$, called the
Hamiltonian operator, such that

$$i\hbar\frac{\partial\psi}{\partial t} = H\psi,$$

where $2\pi\hbar \approx 6.62 \times 10^{-34}$ J s is Planck’s constant.

(iii) To each observable property of the system there corresponds a linear, Hermitian
operator, $A$, and any measurement of the property gives one of the
eigenvalues of $A$. For example, the operators that correspond to momentum
and energy are $-i\hbar\nabla$ and $i\hbar\,\partial/\partial t$.


For a single particle of mass $m$ moving in a potential field $V(\mathbf{r}, t)$, the classical (as
opposed to quantum mechanical) total energy is

$$E = \frac{p^2}{2m} + V(\mathbf{r}, t), \tag{4.55}$$

where $p$ is the momentum of the particle. This is just the sum of the kinetic
and potential energies. To obtain the quantum mechanical analogue of this, we
quantize the classical result (4.55) by substituting the appropriate operators for
momentum and energy, and arrive at

$$i\hbar\frac{\partial\psi}{\partial t} = -\frac{\hbar^2}{2m}\nabla^2\psi + V(\mathbf{r}, t)\psi. \tag{4.56}$$

This is Schrödinger’s equation, which governs the evolution of the wave function.

Let’s look for a separable solution of (4.56) when $V$ is independent of time. We
write $\psi = u(\mathbf{r})T(t)$, and find that

$$i\hbar\frac{T'}{T} = -\frac{\hbar^2}{2mu}\nabla^2 u + V(\mathbf{r}) = E,$$

where $E$ is the separation constant. Since $i\hbar T' = ET$,

$$i\hbar\frac{\partial\psi}{\partial t} = E\psi,$$

and hence $E$ is the energy of the particle. The equation for $u$ is then the time-independent
Schrödinger equation,

$$-\frac{\hbar^2}{2m}\nabla^2 u + \left(V(\mathbf{r}) - E\right)u = 0. \tag{4.57}$$

We seek solutions of this equation subject to the conditions that $u$ should be finite,
and $u$ and $\nabla u$ continuous throughout the domain of solution. After we impose
appropriate boundary conditions, we will find that the energy of the system is not
arbitrary, as classical physics would suggest. Instead, the energy must be one of
the eigenvalues of the boundary value problem associated with (4.57), and hence
the eigenvalues are of great interest. The energy is said to be quantized.

Example: A confined particle

Let’s consider the one-dimensional problem of a particle of mass $m$ confined in a
region of zero potential by an infinite potential at $x = 0$ and $x = a$. What energies
can the particle have?

Since the probability of finding the particle outside the region $0 < x < a$ is zero,
we must have $\psi = 0$ there. By continuity, we therefore have $\psi = 0$ at $x = 0$ and
$x = a$. We must therefore solve the eigenvalue problem

$$\frac{\hbar^2}{2m}\frac{d^2u}{dx^2} + Eu = 0, \tag{4.58}$$

subject to $u = 0$ at $x = 0$ and $x = a$. However, this is precisely the problem, given
by (4.2), with which we began this chapter, with $\lambda = 2mE/\hbar^2$. We conclude that
the allowed energies of the particle are $E = E_n = \hbar^2\lambda_n/2m = \hbar^2 n^2\pi^2/2ma^2$.


Example: The hydrogen atom 

The hydrogen atom consists of an electron and a proton. Since the mass of the 
proton is much larger than that of the electron, let’s assume that the proton is 
at rest. The steady state wave function, $\psi(\mathbf{r})$, then satisfies (4.57) which, after
rescaling $\mathbf{r}$ and $E$ to eliminate the constants, we write as

$$\nabla^2\psi - 2V\psi = -2E\psi \quad \text{for } |\mathbf{r}| > 0. \tag{4.59}$$

The potential $V(\mathbf{r}) = V(r) = -q^2/r$ is the Coulomb potential due to the electrical
attraction between the proton and the electron, where $-q$ is the charge on the
electron and $q$ that on the proton.

We can now look for separable solutions in spherical polar coordinates $(r, \theta, \phi)$
in the form $\psi = R(r)Y(s)\Theta(\phi)$, where $s = \cos\theta$. Substituting this into (4.59) gives
us

$$r^2\frac{R''}{R} + 2r\frac{R'}{R} + 2r^2\left(E - V(r)\right) = -\frac{\{(1 - s^2)Y'\}'}{Y} - \frac{1}{1 - s^2}\frac{\Theta''}{\Theta} = \lambda,$$

for some separation constant $\lambda$. If we take $\Theta''/\Theta = -m^2$ with $m = 1, 2, \ldots$ for
periodicity, we obtain $\Theta = A_m\cos m\phi + B_m\sin m\phi$. If $\lambda = n(n + 1)$, with $n \ge m$,
then the equation for $Y$ is the associated Legendre equation, which we studied in
Chapter 2. The bounded solution of this is $Y = C_n P_n^m(s)$, and we have a solution
in the form

$$\psi(r, \theta, \phi) = R(r)P_n^m(s)\left(D_{n,m}\cos m\phi + E_{n,m}\sin m\phi\right),$$

where $D_{n,m}$ and $E_{n,m}$ are arbitrary constants and $R$ satisfies the differential equation

$$r^2R'' + 2rR' - n(n + 1)R + 2r^2\left(E + \frac{q^2}{r}\right)R = 0.$$

We can simplify matters by defining $S = rR$, so that

$$S'' - \frac{n(n + 1)}{r^2}S + 2\left(E + \frac{q^2}{r}\right)S = 0 \quad \text{for } r > 0, \tag{4.60}$$

to be solved subject to the condition that $S/r \to 0$ as $r \to \infty$. This is an eigenvalue
problem for $E$, the allowable energy levels of the electron in the hydrogen atom.

Some thought and experimentation leads one to try a solution in the form
$S = r^{n+1}e^{-\mu r}$, which will satisfy (4.60) provided that $\mu = q^2/(n + 1)$ and $E = -\frac{1}{2}\mu^2$.
A reduction of order argument then shows that there are no other solutions with
$E < 0$ that satisfy the condition at infinity. Each energy level, $E_N = -q^4/2N^2$ for
$N = 1, 2, \ldots$, corresponds to a possible steady state for our model of the hydrogen
atom, and is in agreement with the experimentally observed values. However, this 
is not the end of the matter, as it is possible to show that every positive value of 
E corresponds to a bounded, nontrivial solution of (4.60). In other words, there is 
both a continuous and a discrete part to the spectrum. These states with positive 
energy can be shown to be unstable, as the electron has too much energy. 
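The conditions on $\mu$ and $E$ for the trial solution can be checked by direct substitution. The following short calculation is a sketch of that check, using only the trial form $S = r^{n+1}e^{-\mu r}$ and equation (4.60).

```latex
\begin{align*}
S &= r^{n+1}e^{-\mu r}, \qquad
S' = \left(\frac{n+1}{r} - \mu\right)S, \qquad
S'' = \left\{\frac{n(n+1)}{r^2} - \frac{2\mu(n+1)}{r} + \mu^2\right\}S, \\
S'' - \frac{n(n+1)}{r^2}S + 2\left(E + \frac{q^2}{r}\right)S
  &= \left\{\frac{2\left(q^2 - \mu(n+1)\right)}{r} + \mu^2 + 2E\right\}S .
\end{align*}
```

The right hand side vanishes for all $r > 0$ only if $\mu = q^2/(n + 1)$ and $E = -\frac{1}{2}\mu^2 = -q^4/2(n + 1)^2$, which is the stated energy level with $N = n + 1$.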


Exercises 


4.1 Use the eigenfunction expansion method to find a solution of the boundary
value problem $y''(x) = -h(x)$ for $0 < x < 2\pi$, subject to the boundary
conditions $y(0) = y(2\pi)$, $y'(0) = y'(2\pi)$, with $h \in C[0, 2\pi]$.

4.2 Find the Green’s function for the boundary value problems

(a) $y''(x) = f(x)$ subject to $y(-1) = y(1) = 0$,

(b) $y''(x) + \omega^2 y(x) = f(x)$ subject to $y(0) = y(\pi/2) = 0$.

4.3 Comment on the difficulties that you face when trying to construct the
Green’s function for the boundary value problem

$y''(x) + y(x) = f(x)$ subject to $y(a) = y'(b) = 0$.


4.4 Show that when $y'' + \lambda y = 0$ for $0 < x < \pi$, the eigenvalues when (a)
$y(0) = y'(\pi) = 0$, (b) $y'(0) = y(\pi) = 0$, (c) $y'(0) = y'(\pi) = 0$, are
$(n + \frac{1}{2})^2$, $(n + \frac{1}{2})^2$ and $n^2$ respectively, where $n = 0, 1, 2, \ldots$. What are
the corresponding eigenfunctions?

4.5 Show that the equation

$$\frac{d^2y}{dx^2} + A(x)\frac{dy}{dx} + \{\lambda B(x) - C(x)\}y = 0$$

can be written in Sturm-Liouville form by defining $p(x) = \exp\left(\int^x A(x)\,dx\right)$.
What are $q(x)$ and $r(x)$ in terms of $A$, $B$ and $C$?

4.6 Write the generalized Legendre equation,

$$(1 - x^2)\frac{d^2y}{dx^2} - 2x\frac{dy}{dx} + \left\{n(n + 1) - \frac{m^2}{1 - x^2}\right\}y = 0,$$

as a Sturm-Liouville equation.

4.7 Determine the eigenvalues, $\lambda$, of the fourth order equation $y^{(4)} + \lambda y = 0$
subject to $y(0) = y'(0) = y(\pi) = y'(\pi) = 0$ for $0 < x < \pi$.

4.8 Consider the singular Sturm-Liouville system

$$(xe^{-x}y')' + \lambda e^{-x}y = 0 \quad \text{for } x > 0,$$

with $y$ bounded as $x \to 0$ and $e^{-x}y \to 0$ as $x \to \infty$. Show that when
$\lambda = 0, 1, 2, \ldots$ there are polynomial eigenfunctions. These are known as
the Laguerre polynomials.

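4.9 Show that the boundary value problem

$$y''(x) + A(x)y'(x) + B(x)y(x) = C(x) \quad \text{for all } x \in (a, b),$$

subject to

$$\alpha_1 y(a) + \beta_1 y(b) + \tilde\alpha_1 y'(a) + \tilde\beta_1 y'(b) = \gamma_1,$$
$$\alpha_2 y(a) + \beta_2 y(b) + \tilde\alpha_2 y'(a) + \tilde\beta_2 y'(b) = \gamma_2,$$

is self-adjoint provided that

$$\beta_1\tilde\beta_2 - \tilde\beta_1\beta_2 = \left(\alpha_1\tilde\alpha_2 - \tilde\alpha_1\alpha_2\right)\exp\left(\int_a^b A(x)\,dx\right).$$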

4.10 Show that

$$-(xy'(x))' = \lambda xy(x)$$

is self-adjoint on the interval $(0, 1)$, with $x = 0$ a singular endpoint and
$x = 1$ a regular endpoint with the condition $y(1) = 0$.

4.11 Find the eigenvalues of the system consisting of Fourier’s equation and the
conditions $y(0) - y'(0) = 0$ and $y(\pi) = 0$. Show that these eigenfunctions
are orthogonal on the interval $(0, \pi)$.

4.12 Prove that $\sin mx$ has at least one zero between each pair of consecutive
zeros of $\sin nx$, when $m > n$.

4.13 * Using the Sturm comparison theorem (see Exercise 1.14), show that every
solution of Airy’s equation, $y''(x) - xy(x) = 0$, vanishes infinitely often on
the negative $x$-axis, and at most once on the positive $x$-axis.




CHAPTER FIVE 


Fourier Series and the Fourier Transform 


In order to motivate our discussion of Fourier series, we shall consider the solution of 
a diffusion problem. Let’s suppose that we have a long, thin, cylindrical metal bar 
of length L whose curved sides and one end are insulated from its surroundings. 
Suppose also that, initially, the temperature of the bar is $T = T_0$, but that the
uninsulated end is suddenly cooled to a temperature $T_1 < T_0$. How does the
temperature in the metal bar vary for $t > 0$? Physical intuition suggests that heat
will flow out of the end of the bar, and that, as time progresses, the temperature
will approach $T_1$ throughout the bar.

In order to quantify this, we note that the temperature in the bar satisfies the
one-dimensional diffusion equation

$$\frac{\partial T}{\partial t} = K\frac{\partial^2 T}{\partial x^2} \quad \text{for } 0 < x < L, \tag{5.1}$$

where $x$ measures distance along the bar and $K$ is its thermal diffusivity (see Section 2.6.1).
The initial and boundary conditions are

$$T(x, 0) = T_0, \quad T(0, t) = T_1, \quad \frac{\partial T}{\partial x}(L, t) = 0. \tag{5.2}$$


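Before we solve this initial-boundary value problem, it is convenient to define
dimensionless variables,

$$\hat{T} = \frac{T - T_1}{T_0 - T_1}, \quad \hat{x} = \frac{x}{L}, \quad \hat{t} = \frac{Kt}{L^2}.$$

In terms of these variables, (5.1) and (5.2) become

$$\frac{\partial\hat{T}}{\partial\hat{t}} = \frac{\partial^2\hat{T}}{\partial\hat{x}^2} \quad \text{for } 0 < \hat{x} < 1, \tag{5.3}$$

$$\hat{T}(\hat{x}, 0) = 1, \quad \hat{T}(0, \hat{t}) = 0, \quad \frac{\partial\hat{T}}{\partial\hat{x}}(1, \hat{t}) = 0. \tag{5.4}$$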

As you can see, by choosing appropriate dimensionless variables, we have managed 
to eliminate all of the physical constants from the problem. 
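The change of variables can be verified with the chain rule. The sketch below writes the dimensionless quantities with hats, $\hat{T} = (T - T_1)/(T_0 - T_1)$, $\hat{x} = x/L$ and $\hat{t} = Kt/L^2$, and shows how each constant cancels from the diffusion equation (5.1).

```latex
\begin{align*}
T &= T_1 + (T_0 - T_1)\hat{T}, \qquad x = L\hat{x}, \qquad t = \frac{L^2}{K}\hat{t}, \\
\frac{\partial T}{\partial t} &= (T_0 - T_1)\frac{K}{L^2}\frac{\partial\hat{T}}{\partial\hat{t}}, \qquad
\frac{\partial^2 T}{\partial x^2} = \frac{T_0 - T_1}{L^2}\frac{\partial^2\hat{T}}{\partial\hat{x}^2}, \\
\frac{\partial T}{\partial t} = K\frac{\partial^2 T}{\partial x^2}
 &\quad\Longrightarrow\quad
\frac{\partial\hat{T}}{\partial\hat{t}} = \frac{\partial^2\hat{T}}{\partial\hat{x}^2}.
\end{align*}
```

The conditions transform in the same way: $T = T_0$ becomes $\hat{T} = 1$, $T = T_1$ becomes $\hat{T} = 0$, and the insulated end $x = L$ becomes $\hat{x} = 1$.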

The use of dimensionless variables has additional advantages, which makes it 
essential to use them in studying most mathematical models of real physical prob- 
lems. Consider the length of the metal bar in the diffusion problem that we are 
studying. It is usual to measure lengths in terms of metres, and, at first sight, it
seems reasonable to say that a metal bar with length 100 m is long, whilst a bar of
length $10^{-2}$ m = 1 cm is short. In effect, we choose a bar of length 1 m as a reference
relative to which we measure the length of our actual bar. Is this a reasonable
thing to do? In fact, the only length that is defined in the problem is the length 
of the bar itself. This is the most sensible length with which to make the problem 
dimensionless, and leads to a bar that lies between $\hat{x} = 0$ and $\hat{x} = 1$. For any given
problem, it is essential to choose a length scale that is relevant to the physics of 
the problem itself, not some basically arbitrary length, such as 1 m. For example, 
problems in celestial mechanics are often best made dimensionless using the mean 
distance from the Earth to the Sun, whilst problems in molecular dynamics may be 
made dimensionless using the mean atomic separation in a hydrogen atom. These 
length scales are enormously different from 1 m. The same argument applies to 
all of the dimensional, physical quantities that are used in a mathematical model. 
By dividing each variable by a suitable constant with the same dimensions, we can 
state that a variable is small or large in a meaningful way. For example, using the 
dimensionless time, $\hat{t} = Kt/L^2$, which we defined above, the solution when $\hat{t} \ll 1$,
the small time solution, really does represent the behaviour of the temperature in 
the metal bar when diffusion has had little effect on the initial state, over the length 
scale, L, that characterizes the full length of the bar. The other advantage of using 
dimensionless variables is that any physical constants that remain explicitly in the 
dimensionless problem appear in dimensionless groups, which can themselves be 
said to be large or small in a meaningful way. No dimensionless groups appear in 
the simple diffusion problem given by (5.3) and (5.4), and we will defer any further 
discussion of them until Chapter 11. 

We can now continue with our study of the diffusion of heat in a metal bar by
seeking a separable solution, $\hat{T}(\hat{x}, \hat{t}) = X(\hat{x})F(\hat{t})$. On substituting this into (5.3)
we obtain

$$\frac{X''}{X} = \frac{F'}{F} = -k^2, \quad \text{the separation constant.}$$

The equation $F' = -k^2F$ has solution $F = ae^{-k^2\hat{t}}$, whilst $X'' + k^2X = 0$, Fourier’s
equation, (4.37), which we studied in Chapter 4, has solution

$$X = A\sin k\hat{x} + B\cos k\hat{x}.$$

The condition $\hat{T}(0, \hat{t}) = 0$ means that $X(0) = 0$, and hence that $B = 0$. Similarly,
the condition that there should be no flux of heat through the bar at $\hat{x} = 1$ leads
to $kA\cos k = 0$, and hence shows that

$$k = \frac{\pi}{2} + n\pi \quad \text{for } n = 0, 1, 2, \ldots.$$

There is therefore a countably-infinite sequence of solutions,

$$\hat{T}_n = A_n\exp\left\{-\pi^2\left(n + \tfrac{1}{2}\right)^2\hat{t}\right\}\sin\left\{\pi\left(n + \tfrac{1}{2}\right)\hat{x}\right\}.$$

Since this is a linear problem, the general solution is

$$\hat{T}(\hat{x}, \hat{t}) = \sum_{n=0}^{\infty} A_n\exp\left\{-\pi^2\left(n + \tfrac{1}{2}\right)^2\hat{t}\right\}\sin\left\{\pi\left(n + \tfrac{1}{2}\right)\hat{x}\right\}. \tag{5.5}$$


We are now left with the task of determining the constants A n . 

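The only information we have not used is the initial condition, $\hat{T}(\hat{x}, 0) = 1$, which
shows that

$$\sum_{n=0}^{\infty} A_n\sin\left\{\pi\left(n + \tfrac{1}{2}\right)\hat{x}\right\} = 1. \tag{5.6}$$

We now multiply this expression through by $\sin\{\pi(m + \frac{1}{2})\hat{x}\}$ and integrate over
the length of the bar. After noting that

$$\int_0^1 \sin\left\{\pi\left(n + \tfrac{1}{2}\right)\hat{x}\right\}\sin\left\{\pi\left(m + \tfrac{1}{2}\right)\hat{x}\right\}d\hat{x} = \int_0^1 \frac{1}{2}\left[\cos\{\pi(m - n)\hat{x}\} - \cos\{\pi(m + n + 1)\hat{x}\}\right]d\hat{x} = 0 \quad \text{for } m \ne n, \tag{5.7}$$

and

$$\int_0^1 \sin^2\left\{\pi\left(m + \tfrac{1}{2}\right)\hat{x}\right\}d\hat{x} = \int_0^1 \frac{1}{2}\left[1 - \cos\{\pi(2m + 1)\hat{x}\}\right]d\hat{x} = \frac{1}{2} \quad \text{for } m = n, \tag{5.8}$$

we conclude that

$$A_n = 2\int_0^1 \sin\left\{\pi\left(n + \tfrac{1}{2}\right)\hat{x}\right\}d\hat{x} = \frac{4}{(2n + 1)\pi},$$

and hence that

$$\hat{T}(\hat{x}, \hat{t}) = \sum_{n=0}^{\infty} \frac{4}{(2n + 1)\pi}\exp\left\{-\pi^2\left(n + \tfrac{1}{2}\right)^2\hat{t}\right\}\sin\left\{\pi\left(n + \tfrac{1}{2}\right)\hat{x}\right\}. \tag{5.9}$$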

This is a Fourier series solution, and is shown in Figure 5.1 at various times. We 
produced Figure 5.1 by plotting the MATLAB function 


    function heat = heat(x,t)
    % Sum the series (5.9) at the points x for a single time t > 0.
    acc = 10^-8;                         % bound on the exponential in the final term
    n = ceil(sqrt(-log(acc)/t)-0.5);     % index of the final term
    N = 0:n; a = 4*exp(-pi^2*(N+0.5).^2*t)/pi./(2*N+1);   % series coefficients
    X = zeros(numel(x),n+1);             % one column per sine mode
    for k = 0:n
      X(:,k+1) = sin(pi*(k+0.5)*x(:));
    end
    heat = X*a';


This adds enough terms of the series that $\exp\{-\pi^2(n + \frac{1}{2})^2\hat{t}\}$ is less than some
small number (acc = $10^{-8}$) in the final term. Note that the function ceil rounds
its argument upwards to the nearest integer (the ceiling function).

Fig. 5.1. The solution of the initial boundary value problem, (5.3) and (5.4), given by
(5.9), at various times.

It is clear that $\hat{T} \to 0$, and hence $T \to T_1$, as $\hat{t} \to \infty$, as expected, and that the
heat flows out of the cool end of the bar, $\hat{x} = 0$. It is precisely problems of this sort,
which were first solved by Fourier himself in the 1820s, that led to the development
of the theory of Fourier series. Note also that (5.9) with $\hat{t} = 0$ and $\hat{x} = 1$ and $1/2$
leads to the interesting results

$$1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \frac{1}{9} - \frac{1}{11} + \cdots = \frac{\pi}{4}, \quad 1 + \frac{1}{3} - \frac{1}{5} - \frac{1}{7} + \frac{1}{9} + \frac{1}{11} - \cdots = \frac{\pi}{2\sqrt{2}}.$$
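These two sums are easy to check numerically. The following is a minimal sketch in MATLAB, not part of the original text; the truncation `nmax` and the variable names are assumptions made for this example. The sign pattern for the second series, $+, +, -, -$, is the one given by $\sin\{\pi(n + \frac{1}{2})/2\}$ for $n = 0, 1, 2, \ldots$.

```matlab
% Partial sums of the two series obtained from (5.9) at t = 0.
nmax = 1e6;                          % assumed truncation
n = 0:nmax;
% x = 1: signs alternate +,-,+,-,...
s1 = sum((-1).^n./(2*n+1));
% x = 1/2: signs repeat in the pattern +,+,-,-
sgn = 1 - 2*(mod(n,4) >= 2);
s2 = sum(sgn./(2*n+1));
% Print each partial sum next to the claimed limit.
fprintf('%.6f  %.6f\n', s1, pi/4);
fprintf('%.6f  %.6f\n', s2, pi/(2*sqrt(2)));
```

Both series converge slowly, with an error that decays only like the reciprocal of the number of terms, which is why so many terms are used here.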

That the basis functions are orthogonal, as given by (5.7) and (5.8), should
come as no surprise, since Fourier’s equation is a Sturm-Liouville equation, and
the Fourier series is just another example of a series expansion in terms of an
orthogonal basis, similar to the Fourier-Legendre and Fourier-Bessel series. We have
developed the Fourier series solution of this problem without too much attention to
mathematical rigour. However, a number of questions arise. Under what conditions
does a series of the form

$$\sum_{n=0}^{\infty} A_n\sin\left\{\pi\left(n + \tfrac{1}{2}\right)x\right\}$$

converge, and is this convergence uniform? Does every function have a convergent
Fourier series representation? These questions can also be asked of expansions in
terms of other sequences of orthogonal functions, such as the Fourier-Legendre and
Fourier-Bessel series. The technical details are, however, rather more straightforward
for the Fourier series.


5.1 General Fourier Series 


The most general form for a Fourier series representation of a function, $f(t)$, with
period $T$ is

$$f(t) = \frac{1}{2}A_0 + \sum_{n=1}^{\infty}\left\{A_n\cos\left(\frac{2\pi nt}{T}\right) + B_n\sin\left(\frac{2\pi nt}{T}\right)\right\} \tag{5.10}$$

and, using the method described above, the Fourier coefficients, $A_n$ and $B_n$, are
given by

$$A_n = \frac{2}{T}\int_{-\frac{1}{2}T}^{\frac{1}{2}T}\cos\left(\frac{2\pi nt}{T}\right)f(t)\,dt, \quad B_n = \frac{2}{T}\int_{-\frac{1}{2}T}^{\frac{1}{2}T}\sin\left(\frac{2\pi nt}{T}\right)f(t)\,dt. \tag{5.11}$$

Note that if $f$ is an odd function of $t$, $A_n = 0$, since $\cos(2\pi nt/T)$ is an even function
of $t$. This means that the resulting Fourier series is a sum of just the odd functions
$\sin(2\pi nt/T)$, a Fourier sine series. Similarly, if $f$ is an even function of $t$, the
resulting expansion is a Fourier cosine series. Equations (5.10) and (5.11) can
also be written in a more compact, complex form as

$$f(t) = \sum_{n=-\infty}^{\infty} C_n e^{2\pi int/T}, \tag{5.12}$$

with complex Fourier coefficients

$$C_n = \frac{1}{2}\left(A_n - iB_n\right) = \frac{1}{T}\int_{-\frac{1}{2}T}^{\frac{1}{2}T} e^{-2\pi int/T}f(t)\,dt. \tag{5.13}$$
As we have seen, Fourier series can arise as solutions of differential equations, 
but they are also useful for representing periodic functions in general. For example, 
consider the function of period $2\pi$, defined by

$$f(t) = t \quad \text{for } -\pi < t < \pi, \quad f(t) = f(t \pm 2n\pi) \quad \text{for } n = 1, 2, \ldots,$$

which is piecewise continuous. Using (5.10) and (5.11), we conclude that

$$f(t) = \sum_{n=1}^{\infty} \frac{2}{n}(-1)^{n+1}\sin nt = 2\left(\sin t - \frac{1}{2}\sin 2t + \frac{1}{3}\sin 3t - \cdots\right). \tag{5.14}$$

The partial sums of this series,

$$f_N(t) = \sum_{n=1}^{N} \frac{2}{n}(-1)^{n+1}\sin nt,$$

are shown in Figure 5.2 for various $N$. An inspection of this figure suggests that
the series does indeed converge, but that the convergence is not uniform. At the
points of discontinuity, $t = \pm\pi$, (remember that $f(t)$ has period $2\pi$, and therefore
$f \to \pi$ as $t \to \pi^-$, but $f \to -\pi$ as $t \to \pi^+$) the series (5.14) gives the value
zero, the mean of the two limits as $t$ approaches $\pi$ from above and below. In the
neighbourhood of $t = \pm\pi$, the difference between the partial sums, $f_N$, and $f(t)$
appears not to become smaller as $N$ increases, but the size of the region where
this occurs decreases - a sure sign of nonuniform convergence. This nonuniform,
oscillatory behaviour close to discontinuities is known as Gibbs’ phenomenon,
which we also met briefly in Chapter 3.
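The persistence of the overshoot can be seen directly by evaluating the partial sums. The following is a minimal MATLAB sketch, not the code used to produce Figure 5.2; the grid and the variable names are assumptions made for this example. It measures the largest amount by which the partial sum of the series for $f(t) = t$ exceeds $t$ on $0 \le t \le \pi$, for the same values of $N$ as in the figure.

```matlab
% Overshoot of the partial sums of the sine series for f(t) = t near t = pi.
t = linspace(0,pi,20001);            % assumed grid on [0, pi]
Nvals = [5 15 25 35];                % values of N used in Figure 5.2
overshoot = zeros(size(Nvals));
for j = 1:numel(Nvals)
    n = (1:Nvals(j))';               % column of mode numbers
    c = 2*(-1).^(n+1)./n;            % sine coefficients 2(-1)^(n+1)/n
    fN = c'*sin(n*t);                % partial sum on the grid (row vector)
    overshoot(j) = max(fN - t);      % largest excess of f_N over f
end
% First row: N; second row: overshoot.
disp([Nvals; overshoot])
```

The second row of the output should stay close to 0.56, roughly 9% of the jump of $2\pi$, rather than decaying as $N$ increases, while the location of the maximum moves towards $t = \pi$.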


Fig. 5.2. The partial sums of the Fourier series for $t$, (5.14), for $N = 5$, 15, 25 and 35.
The function being approximated is shown as a dotted line.


Before we proceed, we need to construct the Dirichlet kernel and prove the
Riemann-Lebesgue lemma, both of which are crucial in what follows.


Lemma 5.1 (The Dirichlet kernel) Let $f$ be a bounded function of period $T$
with $f \in PC[a, b]$†, and let

$$S_n(f, t) = \sum_{m=-n}^{n} C_m e^{2\pi imt/T} \tag{5.15}$$

be the $n$th partial sum of the Fourier series of $f(t)$, with $C_m$ given by (5.13). If
we define the Dirichlet kernel, $D_n(t)$, to be a function of period $T$ given for
$-\frac{1}{2}T < t < \frac{1}{2}T$ by

$$D_n(t) = \begin{cases} \dfrac{\sin\{(2n + 1)\pi t/T\}}{\sin(\pi t/T)} & \text{when } t \ne 0, \\[1ex] 2n + 1 & \text{when } t = 0, \end{cases} \tag{5.16}$$

which is illustrated in Figure 5.3 for $T = 2\pi$, then

$$S_n(f, t) = \frac{1}{T}\int_{-\frac{1}{2}T}^{\frac{1}{2}T} f(\tau)D_n(t - \tau)\,d\tau. \tag{5.17}$$


Fig. 5.3. The Dirichlet kernel, $D_n$, for $n = 10$, 20, 30 and 40 and $T = 2\pi$.


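† See Appendix 2.

Proof From (5.13) and (5.15),

$$S_n(f, t) = \sum_{m=-n}^{n}\left\{\frac{1}{T}\int_{-\frac{1}{2}T}^{\frac{1}{2}T} f(\tau)e^{-2\pi im\tau/T}\,d\tau\right\}e^{2\pi imt/T} = \frac{1}{T}\int_{-\frac{1}{2}T}^{\frac{1}{2}T} f(\tau)\sum_{m=-n}^{n} e^{2\pi im(t - \tau)/T}\,d\tau = \frac{1}{T}\int_{-\frac{1}{2}T}^{\frac{1}{2}T} f(\tau)D_n(t - \tau)\,d\tau,$$

where

$$D_n(t) = \sum_{m=-n}^{n} e^{2\pi imt/T}.$$

Clearly $D_n(0) = 2n + 1$. When $t \ne 0$, putting $m = r - n$ gives the simple geometric
progression

$$D_n(t) = e^{-2\pi int/T}\sum_{r=0}^{2n} e^{2\pi irt/T} = e^{-2\pi int/T}\left(\frac{e^{4\pi i(n + 1/2)t/T} - 1}{e^{2\pi it/T} - 1}\right) = \frac{\sin\{(2n + 1)\pi t/T\}}{\sin(\pi t/T)},$$

and the result is proved. □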
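The closed form can also be checked numerically against the defining sum. The following is a minimal MATLAB sketch; the order `n`, the grid and the variable names are assumptions made for this example. The grid has an even number of points, so that $t = 0$, where the closed form must be replaced by $2n + 1$, is not sampled.

```matlab
% Compare the sum of exponentials with the closed form of the Dirichlet kernel.
T = 2*pi;                            % period, as in Figure 5.3
n = 10;                              % assumed order of the kernel
t = linspace(-T/2,T/2,400);          % even number of points: t = 0 not on the grid
m = (-n:n)';                         % column of indices m = -n..n
Dsum = real(sum(exp(2i*pi*m*t/T),1));            % sum over m for each t
Dclosed = sin((2*n+1)*pi*t/T)./sin(pi*t/T);      % closed form, valid for t ~= 0
maxDiff = max(abs(Dsum - Dclosed));
disp(maxDiff)
```

The displayed difference should be at the level of rounding error.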

Lemma 5.1 tells us that the $n$th partial sum of a Fourier series can be written as
(5.17), a simple integral that involves the underlying function and the Dirichlet
kernel†. As we can see in Figure 5.3, as $n$ increases, the Dirichlet kernel becomes
more and more concentrated around the origin. What we need to show is that,
as $n \to \infty$, the Dirichlet kernel just picks out the value $f(\tau)$ at $\tau = t$ in (5.17),
and hence that $S_n(f, t) \to f(t)$ as $n \to \infty$. As we will see in the next section, the
sequence $\{D_n(t)\}$ is closely related to the delta function.



† As we shall see, this is actually a convolution integral.

Lemma 5.2 (The Riemann-Lebesgue lemma) If $f : [a, b] \to \mathbb{R}$, $f \in PC[a, b]$
and $f$ is bounded, then

$$\lim_{\lambda\to\infty}\int_a^b f(t)\sin\lambda t\,dt = \lim_{\lambda\to\infty}\int_a^b f(t)\cos\lambda t\,dt = 0.$$

Proof This seems intuitively obvious, since, as $\lambda$ increases, the period of oscillation
of the trigonometric function becomes smaller, and the contributions from the
positive and negative parts of the integrand cancel out.

Let $[c, d]$ be a subinterval of $[a, b]$ on which $f$ is continuous. Define

$$I(\lambda) = \int_c^d f(t)\sin\lambda t\,dt. \tag{5.18}$$

The argument for $f(t)\cos\lambda t$ is identical. By making the substitution $t = \tau + \pi/\lambda$,
we obtain

$$I(\lambda) = -\int_{c - \pi/\lambda}^{d - \pi/\lambda} f\left(\tau + \frac{\pi}{\lambda}\right)\sin\lambda\tau\,d\tau. \tag{5.19}$$

Adding (5.18) and (5.19) gives

$$2I(\lambda) = -\int_{c - \pi/\lambda}^{c} f\left(t + \frac{\pi}{\lambda}\right)\sin\lambda t\,dt + \int_{d - \pi/\lambda}^{d} f(t)\sin\lambda t\,dt + \int_c^{d - \pi/\lambda}\left\{f(t) - f\left(t + \frac{\pi}{\lambda}\right)\right\}\sin\lambda t\,dt.$$

Let $K$ be the maximum value of $|f|$ on $[c, d]$, which we know exists by Theorem A2.1.
If we also assume that $\lambda$ is large enough that $\pi/\lambda \le d - c$, then, remembering that
$|\sin\lambda t| \le 1$,

$$|I(\lambda)| \le \frac{K\pi}{\lambda} + \frac{1}{2}\int_c^{d - \pi/\lambda}\left|f(t) - f\left(t + \frac{\pi}{\lambda}\right)\right|dt. \tag{5.20}$$

Now, since $f$ is continuous on the closed interval $[c, d]$, it is also uniformly continuous
there and, given any constant $\epsilon > 0$, we can find a constant $\lambda_0$ such that

$$\left|f(t) - f\left(t + \frac{\pi}{\lambda}\right)\right| < \frac{\epsilon}{d - c - \pi/\lambda} \quad \forall\ \lambda > \lambda_0.$$

Since we can also choose $\lambda_0$ so that $K\pi/\lambda < \epsilon/2$ $\forall\ \lambda > \lambda_0$, (5.20) shows that
$|I(\lambda)| < \epsilon$, and hence that $I(\lambda) \to 0$ as $\lambda \to \infty$. Applying this result to all of the
subintervals of $[a, b]$ on which $f$ is continuous completes the proof. □
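A simple case in which the integral can be evaluated exactly makes the decay explicit. The choice $f(t) = t$ on $[0, 1]$ below is an illustrative assumption, not an example from the text.

```latex
\begin{align*}
I(\lambda) = \int_0^1 t\sin\lambda t\,dt
 &= \left[-\frac{t\cos\lambda t}{\lambda}\right]_0^1 + \frac{1}{\lambda}\int_0^1 \cos\lambda t\,dt
 = -\frac{\cos\lambda}{\lambda} + \frac{\sin\lambda}{\lambda^2}, \\
|I(\lambda)| &\le \frac{1}{\lambda} + \frac{1}{\lambda^2} \to 0 \quad \text{as } \lambda \to \infty .
\end{align*}
```

For this smooth function the integral decays like $1/\lambda$; the lemma guarantees only that it tends to zero, with no particular rate, for a general bounded, piecewise continuous function.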

Theorem 5.1 (The Fourier theorem) If $f$ and $f'$ are bounded functions of
period $T$ with $f, f' \in PC[a, b]$, then the right hand side of (5.12), with $C_n$ given by
(5.13), converges pointwise to

$$\frac{1}{2}\left(\lim_{\tau\to t^-} f(\tau) + \lim_{\tau\to t^+} f(\tau)\right) \quad \text{for } -\tfrac{1}{2}T < t < \tfrac{1}{2}T,$$

$$\frac{1}{2}\left(\lim_{\tau\to -\frac{1}{2}T^+} f(\tau) + \lim_{\tau\to \frac{1}{2}T^-} f(\tau)\right) \quad \text{for } t = -\tfrac{1}{2}T \text{ or } \tfrac{1}{2}T.$$

Note that, at points where $f(t)$ is continuous, (5.10) converges pointwise to $f(t)$.


Proof At any point $t = t_0 \in (-\frac{1}{2}T, \frac{1}{2}T)$,

$$f \to f_- \text{ as } t \to t_0^-, \quad f \to f_+ \text{ as } t \to t_0^+,$$

since $f$ is piecewise continuous, with $f_- = f_+$ if $f$ is continuous at $t = t_0$. Similarly,
since $f'$ is piecewise continuous, $f$ has well-defined left and right derivatives

$$f'_-(t_0) = \lim_{h\to 0}\frac{f_- - f(t_0 - h)}{h}, \quad f'_+(t_0) = \lim_{h\to 0}\frac{f(t_0 + h) - f_+}{h},$$

with $f'_-(t_0) = f'_+(t_0)$ if $f'$ is continuous at $t = t_0$. By the mean value theorem
(Theorem A2.2), for $h$ small enough that $f$ is continuous when $t_0 - h \le t < t_0$,

$$f_- - f(t_0 - h) = f'(c)h \quad \text{for some } t_0 - h < c < t_0.$$

Since $f'$ is bounded, there exists some $M$ such that

$$|f_- - f(t_0 - h)| \le \tfrac{1}{2}Mh.$$

After using a similar argument for $t > t_0$, we arrive at

$$|f_- - f(t_0 - h)| + |f(t_0 + h) - f_+| \le Mh \tag{5.21}$$

for all $h > 0$ such that $f$ is continuous when $t_0 - h < t < t_0$ and when $t_0 < t < t_0 + h$.

Now, in (5.17) we make the change of variable $\tau = t + \tau'$ and, using the facts
that $D_n(t) = D_n(-t)$, and $f$ and $D_n$ have period $T$, we find that

$$S_n(f, t) = \frac{1}{T}\int_{-\frac{1}{2}T}^{\frac{1}{2}T} f(t + \tau')D_n(\tau')\,d\tau' = \frac{1}{T}\int_0^{\frac{1}{2}T}\left\{f(t + \tau') + f(t - \tau')\right\}D_n(\tau')\,d\tau',$$

and hence, since $D_n$ integrates to $\frac{1}{2}T$ over $0 \le \tau' \le \frac{1}{2}T$, that

$$S_n(f, t) - \frac{1}{2}\left(f_- + f_+\right) = \frac{2}{T}\int_0^{\frac{1}{2}T} g(t, \tau')\sin\left\{(2n + 1)\frac{\pi\tau'}{T}\right\}d\tau', \tag{5.22}$$

where

$$g(t, \tau') = \frac{f(t + \tau') - f_+ + f(t - \tau') - f_-}{2\sin(\pi\tau'/T)}.$$

Considered as a function of $\tau'$, $g(t, \tau')$ is bounded and piecewise continuous, except
possibly at $\tau' = 0$. However, for sufficiently small $\tau'$, (5.21) shows that

$$|g(t, \tau')| \le \frac{M|\tau'|}{2|\sin(\pi\tau'/T)|},$$

which is bounded as $\tau' \to 0$. The function $g$ therefore satisfies the conditions of the
Riemann-Lebesgue lemma, and we conclude from (5.22) that $S_n \to \frac{1}{2}(f_- + f_+)$ as
$n \to \infty$ and the result is proved. By exploiting the periodicity of $f$, the same proof
works for $t = -\frac{1}{2}T$ or $\frac{1}{2}T$ with minor modifications. □


Theorem 5.2 (Uniform convergence of Fourier series) For a function $f(t)$
that satisfies the conditions of Theorem 5.1, in any closed subinterval of $[-\frac{1}{2}T, \frac{1}{2}T]$,
the right hand side of (5.12), with $C_n$ given by (5.13), converges uniformly to $f(t)$
if and only if $f(t)$ is continuous there.

We will not give a proof, but note that this is consistent with our earlier discussion 
of Gibbs’ phenomenon. 


Finally, returning to the question of whether all bounded, piecewise continuous,
periodic functions have a convergent Fourier series expansion, we note that this
difficult question was not finally resolved until 1964, when Carleson showed that
such functions exist whose Fourier series expansions diverge at a finite or countably-infinite
set of points (for example, the rational numbers). Of course, as we have
seen in Theorem 5.1, these functions cannot possess a bounded piecewise continuous
derivative, even though they are continuous, and are clearly rather peculiar (see, for
example, Figure A2.1). However, such functions do arise in the theory of Brownian
motion and stochastic processes, which are widely applicable to real world problems,
for example in models of financial derivatives. For further details, see Körner (1988),
and references therein.


5.2 The Fourier Transform 


In order to motivate the definition of the Fourier transform, consider (5.12) and 
(5.13), which define the complex form of the Fourier series expansion of a periodic 
function, f(t). What happens as the period, T, tends to infinity? Combining (5.12) 
and (5.13) gives 


m 


. oo 

\ ' v 2n int/T 

j 1 z_^ 

n =— oo 



(T^nt'/T f(t') dt'. 


If we now let k n = 27T n/T and A k n = k n — k n -\ = 2n/T, we have 


m 


i 

27 r 


Y, M n e ik ^ 



e -iknt' f( t f) dt > 


If we now momentarily abandon mathematical rigour, as T — > oo, k n becomes a 
continuous variable and the summation becomes an integral, so that we obtain 

-| /*oo /*oo 

m= — e ikt e- ikt 'f(i?)dt'dk, (5.23) 

J — oo J — oo 

which is known as the Fourier integral. We will prove this result rigorously later. 
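If we now define the Fourier transform of $f(t)$ as 

$$\tilde{f}(k) = \int_{-\infty}^{\infty} e^{ikt} f(t)\,dt, \tag{5.24}$$

we immediately have (after replacing $k$ by $-k$ in (5.23)) an inversion formula, 

$$f(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{-ikt} \tilde{f}(k)\,dk. \tag{5.25}$$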

Note that some texts alter the definition of the Fourier transform (and its inverse) to 
take account of the factor of $1/2\pi$ in a different way, for example with the transform 
and its inverse each multiplied by a factor of $1/\sqrt{2\pi}$. This is a minor, but irritating, 
detail, which does not affect the basic ideas. 

The Fourier transform maps a function of t to a function of k. In the same 
way as the Fourier series expansion of a periodic function decomposes the function 
into its constituent harmonic parts, the Fourier transform produces a function of 
a continuous variable whose value indicates the frequency content of the original 
function. This has led to the widespread use of the Fourier transform to analyze the 
form of time-varying signals, for example in electrical engineering and seismology. 
We will be more concerned here with the use of the Fourier transform to solve 
partial differential equations. 

Before we can proceed, we need to give (5.24) and (5.25) a firmer mathematical 
basis. What restrictions must we place on the function $f(t)$ for its Fourier transform 
to exist? This is a difficult question, and our approach is to treat $f(t)$ as a 
generalized function. The advantage of this is that every generalized function 
has a Fourier transform and an inverse Fourier transform, and that the ordinary 
functions in whose Fourier transforms we are interested form a subset of the 
generalized functions. We will not go into great detail, for which the reader is referred 
to Lighthill (1958), the classic, and very accessible, introduction to the subject. 


5.2.1 Generalized Functions 

We begin with some definitions. A good function, $g(x)$, is a function in $C^{\infty}(\mathbb{R})$ 
that decays rapidly enough that $g(x)$ and all of its derivatives tend to zero faster 
than $|x|^{-N}$ as $x \to \pm\infty$ for all $N > 0$. For example, $e^{-x^2}$ and $\operatorname{sech}^2 x$ are good 
functions. 

A sequence of good functions, $\{f_n(x)\}$, is said to be regular if, for any good 
function $F(x)$, 

$$\lim_{n\to\infty}\int_{-\infty}^{\infty} f_n(x)F(x)\,dx \tag{5.26}$$

exists. For example $f_n(x) = G(x)/n$ is a regular sequence for any good function 
$G(x)$, with 

$$\lim_{n\to\infty}\int_{-\infty}^{\infty} f_n(x)F(x)\,dx = \lim_{n\to\infty}\frac{1}{n}\int_{-\infty}^{\infty} G(x)F(x)\,dx = 0.$$


Two regular sequences of good functions are equivalent if, for any good function 
$F(x)$, the limit (5.26) exists and is the same for each sequence. For example, the 
regular sequences $\{G(x)/n\}$ and $\{G(x)/n^2\}$ are equivalent. 

A generalized function, $f(x)$, is a regular sequence of good functions, and two 
generalized functions are equal if their defining sequences are equivalent. 
Generalized functions are therefore only defined in terms of their action on integrals of 
good functions, with 

$$\int_{-\infty}^{\infty} f(x)F(x)\,dx = \lim_{n\to\infty}\int_{-\infty}^{\infty} f_n(x)F(x)\,dx,$$

for any good function $F(x)$. 

If $f(x)$ is an ordinary function such that $(1 + x^2)^{-N} f(x)$ is integrable from $-\infty$ 
to $\infty$ for some $N$, then the generalized function $f(x)$ equivalent to the ordinary 
function is defined as any sequence of good functions $\{f_n(x)\}$ such that, for any 
good function $F(x)$, 

$$\lim_{n\to\infty}\int_{-\infty}^{\infty} f_n(x)F(x)\,dx = \int_{-\infty}^{\infty} f(x)F(x)\,dx.$$

For example, the generalized function equivalent to zero can be represented by 
either of the sequences $\{G(x)/n\}$ and $\{G(x)/n^2\}$. 

The zero generalized function is very simple. Let’s consider some more useful 
generalized functions. 

The unit function, $I(x)$, is defined so that for any good function $F(x)$, 

$$\int_{-\infty}^{\infty} I(x)F(x)\,dx = \int_{-\infty}^{\infty} F(x)\,dx.$$

A useful sequence of good functions that defines the unit function is $\{e^{-x^2/4n}\}$. 
The unit function is the generalized function equivalent to the ordinary function 
$f(x) = 1$. 

The Heaviside function, $H(x)$, is defined so that for any good function $F(x)$, 

$$\int_{-\infty}^{\infty} H(x)F(x)\,dx = \int_{0}^{\infty} F(x)\,dx.$$

The generalized function $H(x)$ is equivalent to the ordinary step function† 

$$H(x) = \begin{cases} 0 & \text{for } x < 0, \\ 1 & \text{for } x > 0. \end{cases}$$

† Since generalized functions are only defined through their action on integrals of good functions, 
the value of $H$ at $x = 0$ does not have any significance in this context. 

The sign function, $\operatorname{sgn}(x)$, is defined so that for any good function $F(x)$, 

$$\int_{-\infty}^{\infty} \operatorname{sgn}(x)F(x)\,dx = \int_{0}^{\infty} F(x)\,dx - \int_{-\infty}^{0} F(x)\,dx.$$

Then $\operatorname{sgn}(x)$ can be identified with the ordinary function 

$$\operatorname{sgn}(x) = \begin{cases} -1 & \text{for } x < 0, \\ 1 & \text{for } x > 0. \end{cases}$$

In fact, $\operatorname{sgn}(x) = 2H(x) - I(x)$, since we can note that 

$$\int_{-\infty}^{\infty} \{2H(x) - I(x)\}g(x)\,dx = 2\int_{-\infty}^{\infty} H(x)g(x)\,dx - \int_{-\infty}^{\infty} I(x)g(x)\,dx$$

$$= 2\int_{0}^{\infty} g(x)\,dx - \int_{-\infty}^{\infty} g(x)\,dx = \int_{0}^{\infty} g(x)\,dx - \int_{-\infty}^{0} g(x)\,dx,$$

using the definition of the Heaviside and unit functions. This is just the definition 
of the function $\operatorname{sgn}(x)$. 

The Dirac delta function, $\delta(x)$, is defined so that for any good function $F(x)$, 

$$\int_{-\infty}^{\infty} \delta(x)F(x)\,dx = F(0).$$

No ordinary function can be equivalent to the delta function. We can see from 
(5.17) and Theorem 5.1 that a sequence based on the Dirichlet kernel, 
$f_n(x) = D_n(x)/2\pi = \sin\{(n + \frac{1}{2})x\}/2\pi\sin(\frac{1}{2}x)$, satisfies 

$$F(0) = \lim_{n\to\infty}\int_{-\pi}^{\pi} f_n(x)F(x)\,dx.$$


As we can see in Figure 5.3, the function becomes more and more concentrated 
about the origin as n increases, effectively plucking the value of F(0) out of the 
integral. However, this only works on a finite domain. On an infinite domain, 
the sequence 

$$f_n(x) = \left(\frac{n}{\pi}\right)^{1/2} e^{-nx^2}, \tag{5.27}$$

which is illustrated in Figure 5.4 for various $n$, is a useful way of defining $\delta(x)$. 



Fig. 5.4. The sequence (5.27), which is equivalent to $\delta(x)$. 


5.2.2 Derivatives of Generalized Functions 

Derivatives of generalized functions are defined by the derivative of any of the 
equivalent sequences of good functions. Since we can integrate by parts using any 
member of the sequence, we can also, formally, do so for the equivalent generalized 
function, treating it as if it were zero at infinity. For example 

$$\int_{-\infty}^{\infty} \delta'(x)F(x)\,dx = -\int_{-\infty}^{\infty} \delta(x)F'(x)\,dx = -F'(0).$$


This does not allow us to represent $\delta'(x)$ in terms of other generalized functions, 
but does show us how it acts in an integral, which is all we need to know. 

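We can also show that $H'(x) = \delta(x)$, since 

$$\int_{-\infty}^{\infty} H'(x)F(x)\,dx = -\int_{-\infty}^{\infty} H(x)F'(x)\,dx = -\int_{0}^{\infty} F'(x)\,dx = -\left[F(x)\right]_{0}^{\infty} = F(0).$$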


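Another useful result is 

$$f(x)\delta(x) = f(0)\delta(x).$$

The proof is straightforward, since 

$$\int_{-\infty}^{\infty} f(x)\delta(x)F(x)\,dx = f(0)F(0),$$

and 

$$\int_{-\infty}^{\infty} f(0)\delta(x)F(x)\,dx = \int_{-\infty}^{\infty} \delta(x)\left[f(0)F(x)\right]dx = f(0)F(0).$$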


We can define the modulus function in terms of the function $\operatorname{sgn}(x)$ through 
$|x| = x\operatorname{sgn}(x)$. Now consider the derivative of the modulus function. Using the 
product rule, which is valid because it works for the equivalent sequence of good 
functions, 

$$\frac{d}{dx}|x| = \frac{d}{dx}\{x\operatorname{sgn}(x)\} = x\frac{d}{dx}\{\operatorname{sgn}(x)\} + \operatorname{sgn}(x)\frac{d}{dx}(x).$$

We can now use the fact that $\operatorname{sgn}(x) = 2H(x) - I(x)$ to show that 

$$\frac{d}{dx}|x| = x\frac{d}{dx}\{2H(x) - I(x)\} + \operatorname{sgn}(x) = 2x\delta(x) + \operatorname{sgn}(x) = \operatorname{sgn}(x),$$

since $x\delta(x) = 0$. 
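
The last step relies on two facts that are not written out above, $x\delta(x) = 0$ and $I'(x) = 0$. As a short worked check (a sketch using only the definitions of this section, with $F(x)$ an arbitrary good function), both follow from the action of each generalized function on $F$:

```latex
\begin{align*}
\int_{-\infty}^{\infty} x\,\delta(x)F(x)\,dx
  &= \bigl[\,xF(x)\,\bigr]_{x=0} = 0
  &&\Longrightarrow\quad x\,\delta(x) = 0, \\
\int_{-\infty}^{\infty} I'(x)F(x)\,dx
  &= -\int_{-\infty}^{\infty} I(x)F'(x)\,dx
   = -\bigl[\,F(x)\,\bigr]_{-\infty}^{\infty} = 0
  &&\Longrightarrow\quad I'(x) = 0.
\end{align*}
```

The first line is the result $f(x)\delta(x) = f(0)\delta(x)$ with $f(x) = x$, and the second uses the fact that a good function vanishes at infinity.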


5.2.3 Fourier Transforms of Generalized Functions 

As we have seen, the Fourier transform of a function $f(x)$ is defined as 

$$\mathcal{F}[f(x)] = \tilde{f}(k) = \int_{-\infty}^{\infty} e^{ikx} f(x)\,dx.$$

For example, consider the Fourier transform of the function $e^{-|x|}$, which, using the 
definition, is 

$$\mathcal{F}\left[e^{-|x|}\right] = \int_{-\infty}^{0} e^{x} e^{ikx}\,dx + \int_{0}^{\infty} e^{-x} e^{ikx}\,dx = \frac{1}{1 + ik} - \frac{1}{-1 + ik} = \frac{2}{1 + k^2}. \tag{5.28}$$

We would also like to know the Fourier transform of a constant, $c$. However, it 
is not clear whether 

$$\mathcal{F}[c] = c\int_{-\infty}^{\infty} e^{ikx}\,dx$$

is a well-defined integral. Instead we note that, treated as a generalized function, 
$c = cI(x)$, and we can deal with the Fourier transform of an equivalent sequence 
instead, for example 

$$\mathcal{F}\left[c\,e^{-x^2/4n}\right] = c\int_{-\infty}^{\infty} e^{ikx - x^2/4n}\,dx = c\,e^{-nk^2}\int_{-\infty}^{\infty} e^{-\left(x/2\sqrt{n} - ik\sqrt{n}\right)^2}dx.$$

By writing $z = x - 2ikn$ and deforming the contour of integration in the complex 
plane (see Exercise 5.10 for an alternative method), we find that 

$$\mathcal{F}\left[c\,e^{-x^2/4n}\right] = c\,e^{-nk^2}\int_{-\infty}^{\infty} e^{-z^2/4n}\,dz = 2\pi c\sqrt{\frac{n}{\pi}}\,e^{-nk^2},$$

using (3.5). Since $\{e^{-x^2/4n}\}$ is a sequence equivalent to the unit function, and 
$\{\sqrt{n/\pi}\,e^{-nk^2}\}$ is a sequence equivalent to the delta function, we conclude that 

$$\mathcal{F}[c] = 2\pi c\,\delta(k).$$

Another useful result is that, if $\mathcal{F}[f(x)] = \tilde{f}(k)$, then $\mathcal{F}[f(ax)] = \tilde{f}(k/a)/a$ for $a > 0$. 
To prove this, note that from the definition 

$$\mathcal{F}[f(ax)] = \int_{-\infty}^{\infty} e^{ikx} f(ax)\,dx.$$

Making the change of variable $y = ax$ gives 

$$\mathcal{F}[f(ax)] = \frac{1}{a}\int_{-\infty}^{\infty} e^{iky/a} f(y)\,dy,$$

which is equal to $\tilde{f}(k/a)/a$ as required. As an example of how this result can be 
used, we know that $\mathcal{F}[e^{-|x|}] = 2/(1 + k^2)$, so 

$$\mathcal{F}\left[e^{-a|x|}\right] = \frac{1}{a}\,\frac{2}{1 + (k/a)^2} = \frac{2a}{a^2 + k^2}.$$

Finally, the Fourier transformation is clearly a linear transformation, so that 

$$\mathcal{F}[\alpha f + \beta g] = \alpha\mathcal{F}[f] + \beta\mathcal{F}[g]. \tag{5.29}$$

This means that linear combinations of functions can be transformed separately. 
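
The transform (5.28) and the scaling rule lend themselves to a quick numerical check. The sketch below is illustrative only: it approximates the defining integral, with the sign convention $e^{ikx}$ used in this chapter, by a trapezium rule on a truncated interval, and the function name, interval and step count are assumptions rather than anything prescribed by the text.

```python
import cmath
import math

def fourier_transform(f, k, half_width=30.0, steps=60000):
    """Approximate the integral of exp(i k x) f(x) over [-half_width, half_width].

    Uses the composite trapezium rule; the grid is symmetric with an even
    number of steps, so x = 0 is a grid point (useful for f with a kink there).
    """
    h = 2.0 * half_width / steps
    total = 0.5 * (cmath.exp(-1j * k * half_width) * f(-half_width)
                   + cmath.exp(1j * k * half_width) * f(half_width))
    for i in range(1, steps):
        x = -half_width + i * h
        total += cmath.exp(1j * k * x) * f(x)
    return total * h

if __name__ == "__main__":
    k, a = 1.5, 2.0
    # Compare with (5.28): 2 / (1 + k^2)
    print(fourier_transform(lambda x: math.exp(-abs(x)), k), 2.0 / (1.0 + k * k))
    # Compare with the scaling rule: 2a / (a^2 + k^2)
    print(fourier_transform(lambda x: math.exp(-a * abs(x)), k), 2.0 * a / (a * a + k * k))
```

Because $e^{-a|x|}$ is even, the imaginary part of the numerical transform should be zero to within rounding error.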


5.2.4 The Inverse Fourier Transform 

If we can show that (5.25) holds for all good functions, it follows that it holds 
for all generalized functions. We begin with a useful lemma. 

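Lemma 5.3 The Fourier transform of a good function is a good function. 

Proof If $f(x)$ is a good function, its Fourier transform clearly exists and is given 
by 

$$\tilde{f}(k) = \int_{-\infty}^{\infty} e^{ikx} f(x)\,dx.$$

If we differentiate $p$ times and integrate by parts $N$ times we find that 

$$\left|\tilde{f}^{(p)}(k)\right| = \left|\frac{(-1)^N}{(ik)^N}\int_{-\infty}^{\infty} e^{ikx}\frac{d^N}{dx^N}\left\{(ix)^p f(x)\right\}dx\right| \leq \frac{1}{|k|^N}\int_{-\infty}^{\infty}\left|\frac{d^N}{dx^N}\left\{x^p f(x)\right\}\right|dx.$$

All derivatives of $\tilde{f}$ therefore decay at least as fast as $|k|^{-N}$ as $|k| \to \infty$ for any 
$N > 0$, and hence $\tilde{f}$ is a good function. □ 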

Theorem 5.3 (The Fourier inversion theorem) If $f(x)$ is a good function with 
Fourier transform 

$$\tilde{f}(k) = \int_{-\infty}^{\infty} e^{ikx} f(x)\,dx,$$

then the inverse Fourier transform is given by 

$$f(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{-ikx} \tilde{f}(k)\,dk.$$

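Proof Firstly, we note that for $\varepsilon > 0$, 

$$\mathcal{F}\left[e^{-\varepsilon x^2}\tilde{f}(-x)\right] = \int_{-\infty}^{\infty} e^{ikx - \varepsilon x^2}\left\{\int_{-\infty}^{\infty} e^{-ixt} f(t)\,dt\right\}dx.$$

Since $f$ is a good function, we can exchange the order of integration and arrive at 

$$\mathcal{F}\left[e^{-\varepsilon x^2}\tilde{f}(-x)\right] = \int_{-\infty}^{\infty} f(t)\int_{-\infty}^{\infty} e^{i(k-t)x - \varepsilon x^2}\,dx\,dt$$

$$= \int_{-\infty}^{\infty} f(t)\,e^{-(k-t)^2/4\varepsilon}\int_{-\infty}^{\infty} \exp\left[-\varepsilon\left\{x - \frac{i(k-t)}{2\varepsilon}\right\}^2\right]dx\,dt.$$

Now, by making the change of variable $\bar{x} = x - i(k-t)/2\varepsilon$, checking that this 
change of contour is possible in the complex $x$-plane, we find that 

$$\int_{-\infty}^{\infty} \exp\left[-\varepsilon\left\{x - \frac{i(k-t)}{2\varepsilon}\right\}^2\right]dx = \int_{-\infty}^{\infty} e^{-\varepsilon\bar{x}^2}\,d\bar{x} = \sqrt{\frac{\pi}{\varepsilon}}.$$

This means that 

$$\mathcal{F}\left[e^{-\varepsilon x^2}\tilde{f}(-x)\right] = \sqrt{\frac{\pi}{\varepsilon}}\int_{-\infty}^{\infty} e^{-(k-t)^2/4\varepsilon} f(t)\,dt.$$

In addition, 

$$\int_{-\infty}^{\infty} e^{-(k-t)^2/4\varepsilon}\,dt = 2\sqrt{\pi\varepsilon},$$

so we can write 

$$\frac{1}{2\pi}\mathcal{F}\left[e^{-\varepsilon x^2}\tilde{f}(-x)\right] - f(k) = \frac{1}{2\pi}\sqrt{\frac{\pi}{\varepsilon}}\int_{-\infty}^{\infty} e^{-(k-t)^2/4\varepsilon}\left\{f(t) - f(k)\right\}dt.$$

Since $f$ is a good function, 

$$\left|\frac{f(t) - f(k)}{t - k}\right| \leq \max_{x\in\mathbb{R}}\left|f'(x)\right|,$$

and hence 

$$\left|\frac{1}{2\pi}\mathcal{F}\left[e^{-\varepsilon x^2}\tilde{f}(-x)\right] - f(k)\right| \leq \frac{1}{2\pi}\sqrt{\frac{\pi}{\varepsilon}}\max_{x\in\mathbb{R}}\left|f'(x)\right|\int_{-\infty}^{\infty} e^{-(k-t)^2/4\varepsilon}\,|t - k|\,dt$$

$$= \frac{1}{2\pi}\sqrt{\frac{\pi}{\varepsilon}}\max_{x\in\mathbb{R}}\left|f'(x)\right|\,4\varepsilon\int_{-\infty}^{\infty} e^{-X^2}|X|\,dX \to 0 \quad \text{as } \varepsilon \to 0.$$

We conclude that 

$$f(k) = \frac{1}{2\pi}\mathcal{F}\left[\tilde{f}(-x)\right] = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{ikx}\int_{-\infty}^{\infty} e^{-ixt} f(t)\,dt\,dx.$$

This is precisely the Fourier integral, (5.23), and hence the result is proved. □ 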
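
As an illustration of Theorem 5.3, the transform and its inverse can be applied in turn to a good function and the result compared with the original. This is only a numerical sketch: the choice $f(x) = e^{-x^2}$, the truncated integration ranges and the helper names are assumptions made for the example.

```python
import cmath
import math

def trapezium(g, a, b, steps):
    """Composite trapezium rule for the integral of g over [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (g(a) + g(b))
    for i in range(1, steps):
        total += g(a + i * h)
    return total * h

def transform(f, k):
    """Approximate f~(k), the integral of exp(i k x) f(x) dx, on [-10, 10]."""
    return trapezium(lambda x: cmath.exp(1j * k * x) * f(x), -10.0, 10.0, 400)

def inverse(f_tilde, x):
    """Approximate (1 / 2 pi) times the integral of exp(-i k x) f~(k) dk on [-20, 20]."""
    integral = trapezium(lambda k: cmath.exp(-1j * k * x) * f_tilde(k), -20.0, 20.0, 400)
    return integral / (2.0 * math.pi)

def gaussian(x):
    """The good function f(x) = exp(-x^2), used as the test case."""
    return math.exp(-x * x)

def round_trip(x):
    """Transform the Gaussian numerically, then invert the result at the point x."""
    return inverse(lambda k: transform(gaussian, k), x)

if __name__ == "__main__":
    for x in (0.0, 0.7, 1.5):
        print(x, round_trip(x), gaussian(x))
```

The trapezium rule is exceptionally accurate for rapidly decaying smooth integrands such as these, which is why a fairly coarse grid suffices here.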


5.2.5 Transforms of Derivatives and Convolutions 

Fourier transforms are an appropriate tool for solving differential equations on 
the unbounded domain $-\infty < x < \infty$. In order to proceed, we need to be able 
to find the Fourier transform of a derivative. For good functions, this can be done 
using integration by parts. We find that 

$$\mathcal{F}[f'(x)] = \int_{-\infty}^{\infty} f'(x)e^{ikx}\,dx = -ik\int_{-\infty}^{\infty} f(x)e^{ikx}\,dx = -ik\mathcal{F}[f],$$

since the good function $f$ must tend to zero as $x \to \pm\infty$. Since generalized 
functions are defined in terms of sequences of good functions, this result also holds for 
generalized functions. Similarly, the second derivative is 

$$\mathcal{F}[f''(x)] = -k^2\mathcal{F}[f].$$

We can also define the convolution of two functions $f$ and $g$ as 

$$f * g = \int_{-\infty}^{\infty} f(y)g(x - y)\,dy,$$

which is a function of $x$ only. Note that $f * g = g * f$. As we shall see below, the 
solutions of differential equations can often be written as a convolution because of 
the key property 

$$\mathcal{F}[f * g] = \mathcal{F}[f]\,\mathcal{F}[g].$$

To derive this, we write down the definition of the Fourier transform of $f * g$, 

$$\mathcal{F}[f * g] = \int_{-\infty}^{\infty} e^{ikx}\left\{\int_{-\infty}^{\infty} f(y)g(x - y)\,dy\right\}dx.$$

Note that the factor $e^{ikx}$ is independent of $y$ and so can be taken inside the inner 
integral to give 

$$\mathcal{F}[f * g] = \int_{x=-\infty}^{\infty}\int_{y=-\infty}^{\infty} e^{ikx} f(y)g(x - y)\,dy\,dx.$$

Since the limits of integration are independent of one another, we can exchange the 
order of integration so that 

$$\mathcal{F}[f * g] = \int_{y=-\infty}^{\infty}\int_{x=-\infty}^{\infty} e^{ikx} f(y)g(x - y)\,dx\,dy.$$

Now $f(y)$ is independent of $x$, and can be extracted from the inner integral to give 

$$\mathcal{F}[f * g] = \int_{y=-\infty}^{\infty} f(y)\left\{\int_{x=-\infty}^{\infty} e^{ikx} g(x - y)\,dx\right\}dy.$$

By making the transformation $z = x - y$ (so that $dz = dx$) in the inner integral, 
we have 

$$\mathcal{F}[f * g] = \int_{y=-\infty}^{\infty} f(y)\left\{\int_{z=-\infty}^{\infty} e^{ik(z+y)} g(z)\,dz\right\}dy,$$

and now extracting the factor $e^{iky}$, since this is independent of $z$, allows us to write 

$$\mathcal{F}[f * g] = \int_{y=-\infty}^{\infty} f(y)e^{iky}\,dy\int_{z=-\infty}^{\infty} e^{ikz} g(z)\,dz = \mathcal{F}[f]\,\mathcal{F}[g].$$
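
The convolution property derived above can be tested numerically on a pair of good functions. In this sketch the Gaussians $f(x) = e^{-x^2}$ and $g(x) = e^{-2x^2}$, the truncated integration range and all helper names are illustrative assumptions; the transform uses this chapter's $e^{ikx}$ convention.

```python
import cmath
import math

def trapezium(h_func, a, b, steps):
    """Composite trapezium rule for the integral of h_func over [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (h_func(a) + h_func(b))
    for i in range(1, steps):
        total += h_func(a + i * h)
    return total * h

def transform(func, k):
    """Approximate the integral of exp(i k x) func(x) dx on [-10, 10]."""
    return trapezium(lambda x: cmath.exp(1j * k * x) * func(x), -10.0, 10.0, 400)

def convolution(f, g, x):
    """Approximate (f * g)(x), the integral of f(y) g(x - y) dy on [-10, 10]."""
    return trapezium(lambda y: f(y) * g(x - y), -10.0, 10.0, 400)

def f(x):
    return math.exp(-x * x)

def g(x):
    return math.exp(-2.0 * x * x)

def both_sides(k):
    """Return (F[f * g](k), F[f](k) F[g](k)) computed numerically."""
    lhs = transform(lambda x: convolution(f, g, x), k)
    rhs = transform(f, k) * transform(g, k)
    return lhs, rhs

if __name__ == "__main__":
    print(both_sides(1.3))
```

For these two Gaussians both sides can also be found in closed form, $(\pi/\sqrt{2})\,e^{-3k^2/8}$, which gives an independent check on the numerics.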




5.3 Green’s Functions Revisited 


Let’s, for the moment, forget our previous definition of a Green’s function 
(Section 4.1.2), and define it instead, for a linear operator $L$ with domain $-\infty < x < \infty$, 
as the solution of the differential equation $LG = \delta(x)$ subject to $G \to 0$ as $|x| \to \infty$. 
If we assume that $G$ is a good function, we can use the Fourier transform to find 
$G$. For example, consider the operator 

$$L \equiv \frac{d^2}{dx^2} - 1,$$

so that $LG = \delta$ is 

$$\frac{d^2G}{dx^2} - G = \delta(x).$$

We will solve this equation subject to $G \to 0$ as $|x| \to \infty$. Taking the Fourier 
transform of both sides of this equation and exploiting the linearity of the transform, 
(5.29), gives 

$$\mathcal{F}\left[\frac{d^2G}{dx^2}\right] - \mathcal{F}[G] = \mathcal{F}[\delta(x)].$$

Firstly we need to determine the Fourier transform of the delta function. From the 
definition 

$$\mathcal{F}[\delta(x)] = \int_{-\infty}^{\infty} e^{ikx}\delta(x)\,dx = \left.e^{ikx}\right|_{x=0} = 1.$$

Therefore 

$$-k^2\mathcal{F}[G] - \mathcal{F}[G] = 1,$$

using the fact that $\mathcal{F}[G''] = -k^2\mathcal{F}[G]$. After rearrangement this becomes 

$$\mathcal{F}[G] = -\frac{1}{k^2 + 1},$$

which we know from (5.28) means that 

$$G(x) = -\frac{1}{2}e^{-|x|}.$$

Why should we want to construct such a Green’s function? As before, the answer 
is, to be able to solve the inhomogeneous differential equation. Suppose that we 
need to solve $L\phi = P$. The solution is $\phi = G * P$. To see this, note that 

$$L(G * P) = L\int_{-\infty}^{\infty} G(x - y)P(y)\,dy = \int_{-\infty}^{\infty} LG(x - y)P(y)\,dy = \int_{-\infty}^{\infty} \delta(x - y)P(y)\,dy = P(x).$$

Once we know the Green’s function we can therefore write down the solution of 
$L\phi = P$ as $\phi = G * P$. 
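
For the operator $L = d^2/dx^2 - 1$ treated above, the claim that $\phi = G * P$ solves $L\phi = P$ can be checked numerically. The sketch below is an illustration under stated assumptions: the forcing $P(x) = e^{-x^2}$, the cutoff of the integration range, the finite-difference step and the function names are all choices made for the example.

```python
import math

def trapezium(g, a, b, steps=2000):
    """Composite trapezium rule for the integral of g over [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (g(a) + g(b))
    for i in range(1, steps):
        total += g(a + i * h)
    return total * h

def green(x):
    """Free space Green's function of d^2/dx^2 - 1, namely -exp(-|x|)/2."""
    return -0.5 * math.exp(-abs(x))

def forcing(x):
    """Illustrative forcing term P(x)."""
    return math.exp(-x * x)

def phi(x, cutoff=12.0):
    """phi = G * P, with the integral split at y = x where G(x - y) has a kink."""
    left = trapezium(lambda y: green(x - y) * forcing(y), -cutoff, x)
    right = trapezium(lambda y: green(x - y) * forcing(y), x, cutoff)
    return left + right

def residual(x, step=0.01):
    """phi'' - phi - P at x, with phi'' from a central difference."""
    second = (phi(x + step) - 2.0 * phi(x) + phi(x - step)) / step ** 2
    return second - phi(x) - forcing(x)

if __name__ == "__main__":
    for x in (-1.2, 0.0, 0.3):
        print(x, phi(x), residual(x))  # residuals should be close to zero
```

Splitting the integral at $y = x$ keeps each integrand smooth, so that the discretization error is small compared with the quantities being compared.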

There are both differences and similarities between this definition of the Green’s 
function, which is sometimes referred to as the free space Green’s function, and 
the definition that we gave in Section 4.1.2. Firstly, the free space Green’s function 
depends only on x, whilst the other definition depends upon both x and s. We can 
see some similarities by considering the self-adjoint problem 

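$$\frac{d}{dx}\left(p(x)\frac{dG}{dx}\right) + q(x)G = \delta(x).$$

On any interval that does not contain the point $x = 0$, $G$ is clearly the solution 
of the homogeneous problem, as before, and is continuous there. If we integrate 
between $x = -\varepsilon$ and $x = \varepsilon$, we get 

$$\left[p(x)\frac{dG}{dx}\right]_{-\varepsilon}^{\varepsilon} + \int_{-\varepsilon}^{\varepsilon} q(x)G(x)\,dx = \int_{-\varepsilon}^{\varepsilon} \delta(x)\,dx = 1.$$

If $p(x)$ is continuous at $x = 0$, when we take the limit $\varepsilon \to 0$ this reduces to 

$$\left[\frac{dG}{dx}\right]_{x=0^-}^{x=0^+} = \frac{1}{p(0)},$$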


which, apart from a sign difference, is the result that we obtained in Section 4.1.2. 

The end result from constructing the Green’s function is the same as in 
Section 4.1.2. We can express the solution of the inhomogeneous boundary value 
problem as an integral that involves the Green’s function and the inhomogeneous term. 
We will return to consider the Green’s function for partial differential equations in 
Section 5.5. 


5.4 Solution of Laplace’s Equation Using Fourier Transforms 


Let’s consider the problem of solving Laplace’s equation, $\nabla^2\phi = 0$, for $y > 0$ subject 
to $\phi = f(x)$ on $y = 0$ and $\phi$ bounded as $y \to \infty$. Boundary value problems for 
partial differential equations where the value of the dependent variable is prescribed 
on the boundary are often referred to as Dirichlet problems. We can solve this 
Dirichlet problem using a Fourier transform with respect to $x$. We define 

$$\tilde{\phi}(k, y) = \int_{-\infty}^{\infty} e^{ikx}\phi(x, y)\,dx = \mathcal{F}[\phi].$$


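We begin by taking the Fourier transform of Laplace’s equation, noting that 

$$\mathcal{F}\left[\frac{\partial^2\phi}{\partial y^2}\right] = \frac{\partial^2}{\partial y^2}\mathcal{F}[\phi] = \frac{\partial^2\tilde{\phi}}{\partial y^2},$$

which is easily verified from the definition of the transform. This gives us 

$$\mathcal{F}\left[\frac{\partial^2\phi}{\partial y^2} + \frac{\partial^2\phi}{\partial x^2}\right] = \frac{\partial^2\tilde{\phi}}{\partial y^2} - k^2\tilde{\phi} = 0.$$

This has solution $\tilde{\phi} = A(k)e^{|k|y} + B(k)e^{-|k|y}$. However, we require that $\tilde{\phi}$ is bounded 
as $y \to \infty$, which gives $A(k) = 0$, and hence 

$$\tilde{\phi}(k, y) = B(k)e^{-|k|y}.$$

It now remains to satisfy the condition at $y = 0$, namely $\phi(x, 0) = f(x)$. We take 
the Fourier transform of this condition to give $\tilde{\phi}(k, 0) = \tilde{f}(k)$, where $\tilde{f}(k) = \mathcal{F}[f]$. 
By putting $y = 0$ we therefore find that $B(k) = \tilde{f}(k)$, so that 

$$\tilde{\phi}(k, y) = \tilde{f}(k)e^{-|k|y}.$$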


We can invert this Fourier transform of the solution using the convolution 
theorem. Since 

$$\mathcal{F}^{-1}\left[\tilde{f}\tilde{g}\right] = \mathcal{F}^{-1}\left[\mathcal{F}[f]\,\mathcal{F}[g]\right] = f * g,$$

the solution is just the convolution of the boundary condition, $f(x)$, with the inverse 
transform of $\tilde{g}(k) = e^{-|k|y}$. To find $g(x) = \mathcal{F}^{-1}[\tilde{g}(k)] = \mathcal{F}^{-1}[e^{-|k|y}]$, we note from 
(5.28) that $\mathcal{F}[e^{-|x|}] = 2/(1 + k^2)$, and exploit the fact that $\mathcal{F}[f(ax)] = \tilde{f}(k/a)/a$, 
so that $\mathcal{F}[e^{-a|x|}] = 2a/(a^2 + k^2)$, and hence $e^{-a|x|} = \mathcal{F}^{-1}[2a/(a^2 + k^2)]$. Using the 
formula for the inverse transform gives 

$$e^{-a|x|} = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{-ikx}\frac{2a}{a^2 + k^2}\,dk.$$

We can exploit the similarity between the Fourier transform and its inverse by 
making the transformation $k \mapsto -x$, $x \mapsto k$, which leads to 

$$e^{-a|k|} = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{ikx}\frac{2a}{a^2 + x^2}\,dx = \frac{1}{2\pi}\mathcal{F}\left[\frac{2a}{a^2 + x^2}\right],$$

and hence 

$$\mathcal{F}^{-1}\left[e^{-|k|y}\right] = \frac{y}{\pi(y^2 + x^2)}.$$

In the definition of the convolution we use the variable $\xi$ as the dummy variable 
rather than $y$ to avoid confusion with the spatial coordinate in the problem, so that 

$$f * g = \int_{-\infty}^{\infty} f(\xi)g(x - \xi)\,d\xi.$$


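The solution $\phi$ can now be written as the convolution integral 

$$\phi(x, y) = \int_{-\infty}^{\infty} f(\xi)\frac{y}{\pi\{y^2 + (x - \xi)^2\}}\,d\xi = \frac{y}{\pi}\int_{-\infty}^{\infty}\frac{f(\xi)}{y^2 + (x - \xi)^2}\,d\xi. \tag{5.30}$$

As an example, consider the two-dimensional, inviscid, irrotational flow in the upper 
half plane (see Section 2.6.2), driven by the Dirichlet condition $\phi(x, 0) = f(x)$, 
where 

$$f(x) = \begin{cases} 1 & \text{for } -1 < x < 1, \\ 0 & \text{elsewhere.} \end{cases}$$

Using the formula we have derived, 

$$\phi = \frac{y}{\pi}\int_{-1}^{1}\frac{d\xi}{y^2 + (x - \xi)^2} = \frac{y}{\pi}\left[\frac{1}{y}\tan^{-1}\left(\frac{\xi - x}{y}\right)\right]_{-1}^{1}$$

$$= \frac{1}{\pi}\left\{\tan^{-1}\left(\frac{1 - x}{y}\right) + \tan^{-1}\left(\frac{1 + x}{y}\right)\right\}. \tag{5.31}$$

Figure 5.5 shows some contours of equal $\phi$.† 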
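
The closed form (5.31) can be compared with a direct numerical evaluation of the convolution integral (5.30) for the same top-hat boundary data. This is a minimal sketch; the function names, sample points and step count are assumptions chosen for illustration.

```python
import math

def closed_form(x, y):
    """The potential (5.31) for boundary data equal to 1 on -1 < x < 1, 0 elsewhere."""
    return (math.atan((1.0 - x) / y) + math.atan((1.0 + x) / y)) / math.pi

def poisson_integral(x, y, steps=2000):
    """Trapezium-rule evaluation of (5.30); f vanishes outside [-1, 1]."""
    def integrand(xi):
        return 1.0 / (y * y + (x - xi) ** 2)
    h = 2.0 / steps
    total = 0.5 * (integrand(-1.0) + integrand(1.0))
    for i in range(1, steps):
        total += integrand(-1.0 + i * h)
    return y / math.pi * total * h

if __name__ == "__main__":
    for point in ((0.3, 0.5), (2.0, 1.0), (0.0, 0.001)):
        print(point, closed_form(*point), poisson_integral(*point))
```

Close to the boundary, (5.31) approaches 1 for $|x| < 1$ and 0 for $|x| > 1$, as the Dirichlet condition requires. The simple quadrature above loses accuracy for very small $y$, because the integrand is then sharply peaked near $\xi = x$.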

Fig. 5.5. Contours of constant potential function, $\phi$, given by (5.31). 

† See Section 2.6.2 for details of how to create this type of contour plot in MATLAB. 

Finally, let’s consider the Neumann problem for Laplace’s equation in the 
upper half plane, $y > 0$. This is the same as the Dirichlet problem except that the 
boundary condition is in terms of a derivative, with $\partial\phi/\partial y = f(x)$ at $y = 0$. As 
before we find that $\tilde{\phi}(k, y) = B(k)e^{-|k|y}$, and the condition at $y = 0$ tells us that 

$$B(k) = -\frac{\tilde{f}(k)}{|k|},$$

and hence 

$$\tilde{\phi}(k, y) = -\tilde{f}(k)\frac{e^{-|k|y}}{|k|}.$$

In order to invert this Fourier transform we recall that 

$$\int_{-\infty}^{\infty} e^{ikx}\frac{2y}{y^2 + x^2}\,dx = 2\pi e^{-|k|y},$$

and integrate both sides with respect to $y$ to obtain 

$$\int_{-\infty}^{\infty} e^{ikx}\log\left(y^2 + x^2\right)dx = -2\pi\frac{e^{-|k|y}}{|k|}.$$

In other words 

$$\mathcal{F}^{-1}\left[-\frac{e^{-|k|y}}{|k|}\right] = \frac{1}{2\pi}\log\left(y^2 + x^2\right),$$

and hence the solution is 

$$\phi(x, y) = \frac{1}{2\pi}\int_{-\infty}^{\infty} f(\xi)\log\left\{y^2 + (x - \xi)^2\right\}d\xi.$$
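
As a consistency check on the Neumann solution (a sketch that assumes differentiation under the integral sign is permissible), differentiating with respect to $y$ recovers the Dirichlet formula (5.30) with $\phi$ replaced by $\partial\phi/\partial y$:

```latex
\begin{align*}
\frac{\partial\phi}{\partial y}
  &= \frac{1}{2\pi}\int_{-\infty}^{\infty} f(\xi)\,
     \frac{\partial}{\partial y}\log\left\{y^2 + (x-\xi)^2\right\}d\xi \\
  &= \frac{1}{2\pi}\int_{-\infty}^{\infty} f(\xi)\,
     \frac{2y}{y^2 + (x-\xi)^2}\,d\xi
   = \frac{y}{\pi}\int_{-\infty}^{\infty}
     \frac{f(\xi)}{y^2 + (x-\xi)^2}\,d\xi .
\end{align*}
```

The right hand side is the solution of the Dirichlet problem with boundary data $f$, so $\partial\phi/\partial y \to f(x)$ as $y \to 0$, which is the required Neumann condition.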


5.5 Generalization to Higher Dimensions 

The theory of Fourier transforms and generalized functions can be extended to 
higher dimensions. This allows us to use the techniques that we have developed 
above to solve other partial differential equations. 


5.5.1 The Delta Function in Higher Dimensions 

If we let $\mathbf{x} = (x_1, x_2, \ldots, x_n)$ be a vector in $\mathbb{R}^n$, we can define the delta function 
in $\mathbb{R}^n$, $\delta(\mathbf{x})$, through the integral 

$$\int_{\mathbb{R}^n} \delta(\mathbf{x})F(\mathbf{x})\,d^n x = F(\mathbf{0}).$$

This is a multiple integral over the whole of $\mathbb{R}^n$, and $F$ is a good function in each 
of the coordinates $x_i$. 


Cartesian Coordinates 

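In Cartesian coordinates we have $\delta(\mathbf{x}) = \delta(x_1)\delta(x_2)\cdots\delta(x_n)$, since 

$$\int_{\mathbb{R}^n} \delta(\mathbf{x})F(\mathbf{x})\,d^n x = \int_{\mathbb{R}^{n-1}} \delta(x_1)\cdots\delta(x_{n-1})\left\{\int_{\mathbb{R}} \delta(x_n)F(x_1, \ldots, x_{n-1}, x_n)\,dx_n\right\}d^{n-1}x$$

$$= \int_{\mathbb{R}^{n-1}} \delta(x_1)\cdots\delta(x_{n-1})F(x_1, \ldots, x_{n-1}, 0)\,d^{n-1}x = \cdots = F(0, 0, \ldots, 0).$$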

Plane Polar Coordinates 

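In terms of standard plane polar coordinates, $(r, \theta)$, in $\mathbb{R}^2$, $\delta(\mathbf{x})$ must be isotropic, 
in other words independent of $\theta$, and a multiple of $\delta(r)$. We therefore let $\delta(\mathbf{x}) = 
a(r)\delta(r)$ and, noting that $d^2x = r\,dr\,d\theta$, 

$$\int_{\theta=0}^{2\pi}\int_{r=0}^{\infty} \delta(\mathbf{x})\,d^2x = 2\pi\int_{0}^{\infty} a(r)\delta(r)\,r\,dr.$$

This integral must be equal to one and, by symmetry, 

$$\int_{0}^{\infty} \delta(r)\,dr = \frac{1}{2},$$

so we can take $a(r) = 1/\pi r$, and hence 

$$\delta(\mathbf{x}) = \frac{1}{\pi r}\delta(r).$$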

Spherical Polar Coordinates 

In $\mathbb{R}^3$, the isotropic volume element can be written as $d^3x = 4\pi r^2\,dr$, the volume 
of a thin, spherical shell, where $r$ is now $|\mathbf{x}|$, the distance from the origin. Again, 
the delta function must be a multiple of $\delta(r)$, and the same argument gives 

$$\delta(\mathbf{x}) = \frac{1}{2\pi r^2}\delta(r).$$
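
Spelling out "the same argument" for the spherical case, as a brief worked check that uses only the normalization of the delta function and the symmetry result $\int_0^\infty \delta(r)\,dr = \frac{1}{2}$: writing $\delta(\mathbf{x}) = a(r)\delta(r)$, we need

```latex
\begin{align*}
1 = \int_{\mathbb{R}^3} \delta(\mathbf{x})\,d^3x
  = \int_{0}^{\infty} a(r)\,\delta(r)\,4\pi r^2\,dr ,
\end{align*}
```

and the choice $a(r) = 1/2\pi r^2$ turns the right hand side into $2\int_0^\infty \delta(r)\,dr = 1$, as required.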


5.5.2 Fourier Transforms in Higher Dimensions 

If $f(\mathbf{x}) = f(x_1, x_2, \ldots, x_n)$ and $\mathbf{k} = (k_1, k_2, \ldots, k_n)$, we can define the Fourier 
transform of $f$ as 

$$\tilde{f}(\mathbf{k}) = \int_{\mathbb{R}^n} f(\mathbf{x})e^{i\mathbf{k}\cdot\mathbf{x}}\,d^n x. \tag{5.32}$$

For example, in $\mathbb{R}^3$ 

$$\tilde{f}(k_1, k_2, k_3) = \int_{x_3=-\infty}^{\infty}\int_{x_2=-\infty}^{\infty}\int_{x_1=-\infty}^{\infty} f(x_1, x_2, x_3)\,e^{i(k_1x_1 + k_2x_2 + k_3x_3)}\,dx_1\,dx_2\,dx_3.$$

We can proceed as we did for one-dimensional functions and show, although we will 
not provide the details, that: 

(i) If $f(\mathbf{x})$ is a generalized function, then its Fourier transform exists as a 
generalized function. 

(ii) The inversion formula is 

$$f(\mathbf{x}) = \frac{1}{(2\pi)^n}\int_{\mathbb{R}^n} \tilde{f}(\mathbf{k})e^{-i\mathbf{k}\cdot\mathbf{x}}\,d^n k. \tag{5.33}$$

The inversion of Fourier transforms in higher dimensions is considerably more 
difficult than in one dimension, as we shall see. 

Example: Laplace’s equation 

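Let’s try to construct the free space Green’s function for Laplace’s equation in $\mathbb{R}^3$. 
We introduced the idea of a Green’s function for a linear operator in Section 5.3. 
In this case, we seek a solution of 

$$\nabla^2 G = \delta(\mathbf{x}) \quad \text{subject to } G \to 0 \text{ as } |\mathbf{x}| \to \infty. \tag{5.34}$$

The Fourier transform of $\partial G/\partial x_1$ is 

$$\mathcal{F}\left[\frac{\partial G}{\partial x_1}\right] = \iiint \frac{\partial G}{\partial x_1}e^{i(k_1x_1 + k_2x_2 + k_3x_3)}\,dx_1\,dx_2\,dx_3$$

$$= \iint\left\{\left[G e^{i(k_1x_1 + k_2x_2 + k_3x_3)}\right]_{x_1=-\infty}^{\infty} - \int G\,ik_1 e^{i(k_1x_1 + k_2x_2 + k_3x_3)}\,dx_1\right\}dx_2\,dx_3,$$

after integrating by parts. Since $G$ vanishes at infinity, we conclude that 

$$\mathcal{F}\left[\frac{\partial G}{\partial x_1}\right] = -ik_1\iiint G e^{i(k_1x_1 + k_2x_2 + k_3x_3)}\,dx_1\,dx_2\,dx_3 = -ik_1\tilde{G}.$$

This result is, of course, analogous to the one-dimensional result that we derived 
in Section 5.2.5. 

If we now take the Fourier transform of (5.34), we find that 

$$\left\{(-ik_1)^2 + (-ik_2)^2 + (-ik_3)^2\right\}\tilde{G} = \int_{\mathbb{R}^3} \delta(\mathbf{x})e^{i\mathbf{k}\cdot\mathbf{x}}\,d^3x = 1,$$

and hence $\tilde{G} = -1/|\mathbf{k}|^2$. The inversion formula, (5.33), then shows that 

$$G(\mathbf{x}) = -\frac{1}{8\pi^3}\int_{\mathbb{R}^3}\frac{e^{-i\mathbf{k}\cdot\mathbf{x}}}{|\mathbf{k}|^2}\,d^3k. \tag{5.35}$$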

In order to evaluate this integral, we need to introduce spherical polar coordinates 
$(k, \theta, \phi)$, with the line $\theta = 0$ in the $\mathbf{x}$-direction. Then $\mathbf{k}\cdot\mathbf{x} = |\mathbf{k}||\mathbf{x}|\cos\theta = kr\cos\theta$ 
and (5.35) becomes 

$$G(\mathbf{x}) = -\frac{1}{8\pi^3}\int_{k=0}^{\infty}\int_{\theta=0}^{\pi}\int_{\phi=0}^{2\pi}\frac{e^{-ikr\cos\theta}}{k^2}\,k^2\sin\theta\,dk\,d\theta\,d\phi$$

$$= -\frac{1}{4\pi^2}\int_{k=0}^{\infty}\left[\frac{e^{-ikr\cos\theta}}{ikr}\right]_{\theta=0}^{\pi}dk = -\frac{1}{2\pi^2}\int_{k=0}^{\infty}\frac{\sin kr}{kr}\,dk$$

$$= -\frac{1}{2\pi^2 r}\int_{z=0}^{\infty}\frac{\sin z}{z}\,dz = -\frac{1}{4\pi r},$$

using the standard result, $\int_0^{\infty}(\sin z/z)\,dz = \pi/2$ (see, for example, Ablowitz and 
Fokas, 1997). 
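
Two properties of $G = -1/4\pi r$ are easy to confirm numerically: it is harmonic away from the origin, and the flux of $\nabla G$ through any sphere centred on the origin is one, as the delta function on the right hand side of (5.34) demands. The following sketch uses central differences; the sample point, step sizes and function names are illustrative assumptions.

```python
import math

def green(x, y, z):
    """Free space Green's function for Laplace's equation in three dimensions."""
    return -1.0 / (4.0 * math.pi * math.sqrt(x * x + y * y + z * z))

def laplacian(func, point, h=1e-3):
    """Second-order central-difference approximation to the Laplacian at a point."""
    x, y, z = point
    centre = func(x, y, z)
    total = func(x + h, y, z) + func(x - h, y, z)
    total += func(x, y + h, z) + func(x, y - h, z)
    total += func(x, y, z + h) + func(x, y, z - h)
    return (total - 6.0 * centre) / (h * h)

def flux_through_sphere(radius, h=1e-3):
    """Surface area of the sphere times dG/dr, with dG/dr from a central difference."""
    radial_derivative = (green(radius + h, 0.0, 0.0) - green(radius - h, 0.0, 0.0)) / (2.0 * h)
    return 4.0 * math.pi * radius ** 2 * radial_derivative

if __name__ == "__main__":
    print(laplacian(green, (0.3, -0.4, 0.5)))  # close to zero away from the origin
    print(flux_through_sphere(0.7))            # close to one for any radius
```

Since $G$ depends only on $r$, the flux through a sphere of radius $r$ is simply $4\pi r^2\,dG/dr$, which is what the second function evaluates.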

Now that we know the Green’s function, we are able to solve the inhomogeneous 
problem 

$$\nabla^2\phi = Q(\mathbf{x}) \quad \text{subject to } \phi \to 0 \text{ as } |\mathbf{x}| \to \infty, \tag{5.36}$$

in terms of a convolution integral, by a direct generalization of the results presented 
in Section 5.2.5. We find that 

$$\phi(\mathbf{x}) = -\frac{1}{4\pi}\int_{\mathbb{R}^3}\frac{Q(\mathbf{y})}{|\mathbf{x} - \mathbf{y}|}\,d^3y.$$

This result is fundamental in the theory of electrostatics, where $Q$ is the distribution 
of charge density, and $\phi$ the corresponding electrical potential. 

Example: The wave equation 
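Let’s try to solve the three-dimensional wave equation†, 

$$\nabla^2 u = \frac{1}{c^2}\frac{\partial^2 u}{\partial t^2} \quad \text{for } t > 0 \text{ and } \mathbf{x} \in \mathbb{R}^3, \tag{5.37}$$

subject to the initial conditions 

$$u(x, y, z, 0) = 0, \quad \frac{\partial u}{\partial t}(x, y, z, 0) = f(x, y, z). \tag{5.38}$$

The Fourier transform of (5.37) is 

$$\frac{\partial^2\tilde{u}}{\partial t^2} = -c^2k^2\tilde{u}, \tag{5.39}$$

where $\tilde{u}$ is the Fourier transform of $u$ and $k^2 = k_1^2 + k_2^2 + k_3^2$. The initial conditions, 
(5.38), then become 

$$\tilde{u}(\mathbf{k}, 0) = 0, \quad \frac{\partial\tilde{u}}{\partial t}(\mathbf{k}, 0) = \tilde{f}(\mathbf{k}).$$

The general solution of (5.39) is 

$$\tilde{u} = A(\mathbf{k})e^{ickt} + B(\mathbf{k})e^{-ickt},$$

and when we enforce the initial conditions, this becomes 

$$\tilde{u} = \frac{\tilde{f}(\mathbf{k})}{2ick}\left(e^{ickt} - e^{-ickt}\right).$$

The inversion formula, (5.33), then shows that 

$$u(\mathbf{x}, t) = \frac{1}{8\pi^3}\int_{\mathbb{R}^3}\frac{\tilde{f}(\mathbf{k})}{2ick}\left\{\exp\left(ickt - i\mathbf{k}\cdot\mathbf{x}\right) - \exp\left(-ickt - i\mathbf{k}\cdot\mathbf{x}\right)\right\}d^3k. \tag{5.40}$$

† See Section 3.9.1 for a derivation of the wave equation in two dimensions, and Billingham and 
King (2001) for a derivation of (5.37) for sound waves. 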

For any given value of $\mathbf{k}$, the terms $\exp(\pm ickt - i\mathbf{k}\cdot\mathbf{x})$ represent plane travelling 
wave solutions of the three-dimensional wave equation, since they remain constant 
on the planes $\mathbf{k}\cdot\mathbf{x} = \pm ckt$, which move perpendicular to themselves, in the direction 
$\pm\mathbf{k}$, at speed $c$. Since the integral (5.40) is a weighted sum of these plane waves at 
different wavenumbers $\mathbf{k}$, we can see that the Fourier transform has revealed that 
the solution of (5.37) subject to (5.38) can be written as a continuous spectrum of 
plane wave solutions travelling in all directions. It is somewhat easier to interpret 
solutions like (5.40) in the large time limit, $t \gg 1$, with $|\mathbf{x}|/t$ fixed. We will do this 
in Section 11.2.2 using the method of stationary phase. 
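
For completeness, the step from the general solution of (5.39) to the transformed solution used in (5.40) can be written out. This is a short worked sketch using only the transformed initial conditions stated above:

```latex
\begin{align*}
\tilde{u}(\mathbf{k},0) = 0
  &\;\Longrightarrow\; A(\mathbf{k}) + B(\mathbf{k}) = 0, \\
\frac{\partial\tilde{u}}{\partial t}(\mathbf{k},0) = \tilde{f}(\mathbf{k})
  &\;\Longrightarrow\; ick\,\{A(\mathbf{k}) - B(\mathbf{k})\} = \tilde{f}(\mathbf{k}), \\
\text{so that}\quad A(\mathbf{k}) = -B(\mathbf{k}) = \frac{\tilde{f}(\mathbf{k})}{2ick},
  &\qquad
  \tilde{u} = \frac{\tilde{f}(\mathbf{k})}{2ick}\left(e^{ickt} - e^{-ickt}\right)
            = \tilde{f}(\mathbf{k})\,\frac{\sin ckt}{ck}.
\end{align*}
```

The final form shows that $\tilde{u}$ is real whenever $\tilde{f}$ is, and that it remains bounded as $k \to 0$.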


Exercises 

5.1 Determine the Fourier series expansion of $f(x)$ for $-\pi < x < \pi$, where (a) 
$f(x) = e^x$, (b) $f(x) = |\sin x|$, (c) $f(x) = x^2$. 

5.2 Determine the Fourier series expansion of the function 

$$f(x) = \begin{cases} 0 & \text{for } -\pi < x < 0, \\ x & \text{for } 0 < x < \pi. \end{cases}$$


5.3 Show that 

(a) $\delta(ax) = \dfrac{1}{|a|}\delta(x)$, 

(b) $\delta(x^2 - a^2) = \dfrac{1}{2a}\{\delta(x + a) + \delta(x - a)\}$. 

5.4 Express $\delta(ax + b)$ in the form $\mu\delta(x + \sigma)$ for appropriately chosen constants 
$\mu$ and $\sigma$. 

5.5 What is the general solution of the equation $(x - a)f(x) = b$, if $f$ is a 
generalized function? 

5.6 If $g(x) = 0$ at the points $x_i$ for $i = 1, 2, \ldots, n$, find an expression for $\delta(g(x))$ 
in terms of the sequence of delta functions $\{\delta(x - x_i)\}$. 

5.7 Show that $\dfrac{d}{dx}\operatorname{sgn}(x) = 2\delta(x)$. What is $\dfrac{d}{dx}\,[\,\ldots\,]$? 

5.8 Calculate the Fourier transforms of 

(a) $f(x) = \begin{cases} 0 & \text{for } x < -1 \text{ and } x > 1, \\ -1 & \text{for } -1 < x < 0, \\ 1 & \text{for } 0 < x < 1, \end{cases}$ 

(b) $f(x) = \begin{cases} e^{x} & \text{for } x < 0, \\ 0 & \text{for } x > 0, \end{cases}$ 

(c) $f(x) = \begin{cases} xe^{-x} & \text{for } x > 0, \\ 0 & \text{for } x < 0. \end{cases}$ 

5.9 Show that the Fourier transform has the ‘shifting property’, 
$\mathcal{F}[f(x - a)] = e^{ika}\mathcal{F}[f(x)]$. 


5.10 If 

$$J(k) = \int_{-\infty}^{\infty} e^{-(x - ik/2)^2}\,dx,$$

show by differentiation under the integral sign that $dJ/dk = 0$. Verify that 
$J(0) = \sqrt{\pi}$ and hence evaluate $\mathcal{F}[e^{-x^2}]$. 

5.11 Use Fourier transforms to show that the solution of the initial boundary 
value problem, 

$$\frac{\partial u}{\partial t} = \kappa\frac{\partial^2 u}{\partial x^2} - \lambda u, \quad \text{for } -\infty < x < \infty,\ t > 0,$$

with $\lambda$ and $\kappa$ real constants, subject to 

$$u(x, 0) = f(x), \quad u \to 0 \text{ as } |x| \to \infty,$$

can be written in convolution form as 

$$u(x, t) = \frac{e^{-\lambda t}}{\sqrt{4\pi\kappa t}}\int_{-\infty}^{\infty}\exp\left\{-\frac{(x - y)^2}{4\kappa t}\right\}f(y)\,dy.$$

5.12 For $n \in \mathbb{N}$ and $k \in \mathbb{Z}$, evaluate the integral 

$$\int_{-\pi}^{\pi} e^{kx}e^{inx}\,dx,$$

and hence show that 

$$\int_{-\pi}^{\pi} e^{kx}\cos nx\,dx = \frac{(-1)^{n}\,2k\sinh k\pi}{k^2 + n^2}.$$

Obtain the Fourier series of the function (of period $2\pi$) defined by 

$$f(x) = \cosh kx \quad \text{for } -\pi < x < \pi.$$

Find the value of 

$$\sum_{n=1}^{\infty}\frac{1}{k^2 + n^2}$$

as a function of $k$ and use (4.49) to evaluate 

$$\sum_{n=1}^{\infty}\frac{1}{(k^2 + n^2)^2}.$$

5.13 The forced wave equation is 

$$\frac{\partial^2\phi}{\partial x^2} - \frac{1}{c_0^2}\frac{\partial^2\phi}{\partial t^2} = Q(x, t),$$

where $Q$ is the forcing function, which satisfies $Q \to 0$ as $|x| \to \infty$. 

(a) Show that if $Q = q(x)e^{-i\omega t}$ then a separable solution exists with 
$\phi = e^{-i\omega t}f(x)$ and $f'' + k_0^2 f = q(x)$, where $k_0 = \omega/c_0$. 

(b) Find the Green’s function solution of $G'' + k_0^2 G = \delta(x)$ that behaves 
like $e^{\pm ik_0x}$ as $x \to \pm\infty$. What is the physical significance of these 
boundary conditions? 

(c) Show that 

$$\phi = e^{-i\omega t}G * q = \frac{e^{-i\omega t}}{2ik_0}\left\{\int_{-\infty}^{x} q(y)e^{ik_0(x - y)}\,dy + \int_{x}^{\infty} q(y)e^{-ik_0(x - y)}\,dy\right\}.$$

(d) Show that, as $x \to \infty$, 

$$\phi \sim \frac{e^{-i(\omega t - k_0x)}}{2ik_0}\int_{-\infty}^{\infty} q(y)e^{-ik_0y}\,dy.$$

What is the physical interpretation of this result? 

5.14 The free space Green’s function for the modified Helmholtz equation 
in three dimensions satisfies 

$$\left(\nabla^2 - m^2\right)G = \delta(x_1, x_2, x_3),$$

with $G \to 0$ as $|\mathbf{x}| \to \infty$. Use Fourier transforms to show that 

$$G(\mathbf{x}) = -\frac{1}{8\pi^3}\int_{\mathbb{R}^3}\frac{e^{-i\mathbf{k}\cdot\mathbf{x}}}{k^2 + m^2}\,d^3k.$$

Use contour integration to show that this can be simplified to 

$$G(\mathbf{x}) = -\frac{1}{4\pi r}e^{-mr},$$

where $r = |\mathbf{x}|$. Verify by direct substitution that this function satisfies the 
modified Helmholtz equation. 


Integral transforms of the form 

$$T[f(x)] = \int_{I} K(x, k)f(x)\,dx,$$
where I is some interval on the real line, K(x, k) is the kernel of the transform and k 
is the transform variable, are useful tools for solving linear differential equations. 
Other types of equation, such as linear integral equations, can also be solved using 
integral transforms. In Chapter 5 we met the Fourier transform, for which the kernel 
is $e^{ikx}$ and $I = \mathbb{R}$. The Fourier transform allows us to solve linear boundary value 
problems whose domain of solution is the whole real line. In this chapter we will 
study the Laplace transform, for which the usual notation for the original variable 
is $t$ and for the transform variable is $s$, the kernel is $e^{-st}$ and $I = \mathbb{R}^+ = [0, \infty)$. 
This transform allows us to solve linear initial value problems, with t representing 
time. As we shall see, it is closely related to the Fourier transform. 


6.1 Definition and Examples 

The Laplace transform of $f(t)$ is 

$$\mathcal{L}[f(t)] = F(s) = \int_{0}^{\infty} e^{-st}f(t)\,dt. \tag{6.1}$$

We will consider for what values of s the integral is convergent later in the chapter, 
and begin with some examples. 


Example 1 

Consider $f(t) = e^{kt}$, where $k$ is a constant. Substituting this into (6.1), we have 

$$\mathcal{L}[e^{kt}] = \int_{0}^{\infty} e^{-st}e^{kt}\,dt = \int_{0}^{\infty} e^{-(s-k)t}\,dt = \left[\frac{e^{-(s-k)t}}{-(s-k)}\right]_{0}^{\infty}.$$

Now, in order that $e^{-(s-k)t} \to 0$ as $t \to \infty$, and hence that the integral converges, 
we need $s > k$. If $s$ is a complex-valued variable, which is how we will need to treat 
$s$ when we use the inversion formula, (6.5), we need $\mathrm{Re}(s) > \mathrm{Re}(k)$. This shows 
that 

$$\mathcal{L}[e^{kt}] = \frac{1}{s - k} \quad \text{for } \mathrm{Re}(s) > \mathrm{Re}(k).$$

Example 2 


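Consider $f(t) = \cos\omega t$, where $\omega$ is a constant. Since $\cos\omega t = \frac{1}{2}\left(e^{i\omega t} + e^{-i\omega t}\right)$, 

$$\mathcal{L}[\cos\omega t] = \frac{1}{2}\int_{0}^{\infty}\left(e^{(i\omega - s)t} + e^{-(i\omega + s)t}\right)dt = \frac{1}{2}\left(\frac{1}{s - i\omega} + \frac{1}{s + i\omega}\right),$$

provided $\mathrm{Re}(s) > 0$. We can combine the two fractions to show that 

$$\mathcal{L}[\cos\omega t] = \frac{s}{s^2 + \omega^2} \quad \text{for } \mathrm{Re}(s) > 0.$$

It is easy to show that $\mathcal{L}[\sin\omega t] = \omega/(s^2 + \omega^2)$ using the same technique. In the 
same vein, 

$$\mathcal{L}[\cosh(at)] = \frac{s}{s^2 - a^2}, \quad \mathcal{L}[\sinh(at)] = \frac{a}{s^2 - a^2}.$$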

Example 3 

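Consider $f(t) = t^n$ for $n \in \mathbb{R}$, with $n > -1$ so that the integral converges at $t = 0$. By definition, 

$$\mathcal{L}[t^n] = \int_{0}^{\infty} e^{-st}t^n\,dt.$$

If we let $x = st$ so that $dx = s\,dt$, we obtain 

$$\mathcal{L}[t^n] = \int_{0}^{\infty} e^{-x}\left(\frac{x}{s}\right)^n\frac{1}{s}\,dx = \frac{1}{s^{n+1}}\int_{0}^{\infty} e^{-x}x^n\,dx = \frac{\Gamma(n+1)}{s^{n+1}}.$$

If $n$ is an integer, this gives $\mathcal{L}[t^n] = n!/s^{n+1}$, a result that we could have obtained 
directly using integration by parts. 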
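
The three examples above can be checked against a direct numerical evaluation of the defining integral (6.1) for real $s$. The sketch below is illustrative: the truncation of the upper limit, the step count, the sample parameter values and the function name are assumptions, and the quadrature is a plain trapezium rule rather than anything tailored to Laplace transforms.

```python
import math

def laplace_transform(f, s, upper=60.0, steps=60000):
    """Approximate the integral of exp(-s t) f(t) over [0, upper] for real s.

    The infinite upper limit in (6.1) is truncated, which is harmless when
    exp(-s t) f(t) has decayed to a negligible size by t = upper.
    """
    h = upper / steps
    total = 0.5 * (f(0.0) + math.exp(-s * upper) * f(upper))
    for i in range(1, steps):
        t = i * h
        total += math.exp(-s * t) * f(t)
    return total * h

if __name__ == "__main__":
    # Example 1: L[exp(k t)] = 1 / (s - k), with k = 1 and s = 3
    print(laplace_transform(lambda t: math.exp(t), 3.0), 1.0 / (3.0 - 1.0))
    # Example 2: L[cos(w t)] = s / (s^2 + w^2), with w = 2 and s = 1.5
    print(laplace_transform(lambda t: math.cos(2.0 * t), 1.5), 1.5 / (1.5 ** 2 + 2.0 ** 2))
    # Example 3: L[t^n] = Gamma(n + 1) / s^(n + 1), with n = 2.5 and s = 2
    print(laplace_transform(lambda t: t ** 2.5, 2.0), math.gamma(3.5) / 2.0 ** 3.5)
```

Each printed pair should agree to several decimal places. Choosing $s$ too close to the edge of the region of convergence (for example $s$ only slightly greater than $k$ in Example 1) would require a much larger truncation point.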


In order to use the Laplace transform, we will need to know how to invert it, so 
that we can determine $f(t)$ from a given function $F(s)$. For the Fourier transform, 
this inversion process involves the integral formula (5.25). The inverse of a Laplace 
transform is rather similar, and involves an integral in the complex $s$-plane. We will 
return to give an outline derivation of the inversion formula, (6.5), in Section 6.4. 
For the moment, we will deal with the problem of inverting Laplace transforms by 
trying to recognize the function $f(t)$ from the form of $F(s)$. As we shall see, there 
are a lot of techniques available that allow us to invert the Laplace transform in an 
elementary manner. 


6.1.1 The Existence of Laplace Transforms 

So far, we have rather glossed over the question of when the Laplace transform 
of a particular function actually exists. We have, however, discussed in passing that 
the real part of s needs to be greater than some constant value in some cases. In 
fact the definition of the Laplace transform is an improper integral and should be 
written as 

$$\mathcal{L}[f(t)] = \lim_{N\to\infty}\int_0^N e^{-st} f(t)\,dt.$$

We will now fix matters by defining a class of functions for which the Laplace
transform does exist. We say that a function $f(t)$ is of exponential order on
$0 \le t < \infty$ if there exist constants $A$ and $b$ such that $|f(t)| < Ae^{bt}$ for $t \in [0, \infty)$.

Theorem 6.1 (Lerch’s theorem) A piecewise continuous function of exponential
order on $[0, \infty)$ has a Laplace transform.

The proof of this is rather technical, and the interested reader is referred to Kreider,
Kuller, Ostberg and Perkins (1966).

Theorem 6.2 If $f$ is of exponential order on $[0, \infty)$, then $\mathcal{L}[f] \to 0$ as $|s| \to \infty$.


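Proof Consider the definition of $\mathcal{L}[f(t)]$ and its modulus

$$|\mathcal{L}[f(t)]| = \left|\int_0^\infty e^{-st} f(t)\,dt\right| \le \int_0^\infty |e^{-st} f(t)|\,dt = \int_0^\infty e^{-st}|f(t)|\,dt,$$

using the triangle inequality. If $f$ is of exponential order,

$$|\mathcal{L}[f(t)]| < \int_0^\infty e^{-st} Ae^{bt}\,dt = \int_0^\infty Ae^{(b-s)t}\,dt = \left[\frac{A}{b-s}e^{(b-s)t}\right]_0^\infty = \lim_{Y\to\infty}\left\{\frac{A}{b-s}e^{(b-s)Y}\right\} - \frac{A}{b-s}.$$

Provided $s > b$, we therefore have

$$|\mathcal{L}[f(t)]| < \frac{A}{s-b},$$

and hence $|\mathcal{L}[f(t)]| \to 0$ as $|s| \to \infty$. □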


Conversely, if $\lim_{s\to\infty} F(s) \ne 0$, then $F(s)$ is not the Laplace transform of a function
of exponential order. For example, $s^2/(s^2 + 1) \to 1$ as $s \to \infty$, and is not therefore
the Laplace transform of a function of exponential order.

In contrast, if $f$ is not of exponential order and grows too quickly as $t \to \infty$,
the integral will not converge. For example, consider the Laplace transform of the
function $e^{t^2}$,

$$\mathcal{L}[e^{t^2}] = \lim_{N\to\infty}\int_0^N e^{t^2} e^{-st}\,dt.$$

It should be clear that $e^{t^2}$ grows faster than $e^{st}$ for all $s$, so that the integrand
diverges, and hence that the Laplace transform of $e^{t^2}$ does not exist.


6.2 Properties of the Laplace Transform 

Theorem 6.3 (Linearity) The Laplace transform and inverse Laplace transform 
are linear operators. 
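
Proof By definition,

$$\mathcal{L}[\alpha f(t) + \beta g(t)] = \int_0^\infty e^{-st}\left(\alpha f(t) + \beta g(t)\right) dt = \alpha\int_0^\infty e^{-st} f(t)\,dt + \beta\int_0^\infty e^{-st} g(t)\,dt = \alpha\mathcal{L}[f(t)] + \beta\mathcal{L}[g(t)],$$

so the Laplace transform is a linear operator. Taking the inverse Laplace transform
of both sides gives

$$\alpha f(t) + \beta g(t) = \mathcal{L}^{-1}\left[\alpha\mathcal{L}[f(t)] + \beta\mathcal{L}[g(t)]\right].$$

Since we can write $f(t)$ as $\mathcal{L}^{-1}[\mathcal{L}[f(t)]]$ and similarly for $g(t)$, this gives

$$\alpha\mathcal{L}^{-1}[\mathcal{L}[f(t)]] + \beta\mathcal{L}^{-1}[\mathcal{L}[g(t)]] = \mathcal{L}^{-1}\left[\alpha\mathcal{L}[f(t)] + \beta\mathcal{L}[g(t)]\right].$$

If we now define $F(s) = \mathcal{L}[f]$ and $G(s) = \mathcal{L}[g]$, we obtain

$$\alpha\mathcal{L}^{-1}[F] + \beta\mathcal{L}^{-1}[G] = \mathcal{L}^{-1}[\alpha F + \beta G],$$

so the inverse Laplace transform is also a linear operator. □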


This is extremely useful, since we can calculate, for example, $\mathcal{L}[2t^2 - t + 1]$ and
$\mathcal{L}[e^{3t} + \cos 2t]$ easily in terms of the Laplace transforms of their constituent parts.

As we mentioned earlier, the Laplace inversion formula involves complex
integration, which we would prefer to avoid when possible. Often we can recognize
the constituents of an expression whose inverse Laplace transform we seek. For
example, consider $\mathcal{L}^{-1}[1/(s^2 - 5s + 6)]$. We can proceed by splitting this rational
function into its partial fractions representation and exploit the linearity of the
inverse Laplace transform. We have

$$\mathcal{L}^{-1}\left[\frac{1}{s^2 - 5s + 6}\right] = \mathcal{L}^{-1}\left[\frac{1}{(s-2)(s-3)}\right] = \mathcal{L}^{-1}\left[\frac{1}{s-3} - \frac{1}{s-2}\right] = -\mathcal{L}^{-1}\left[\frac{1}{s-2}\right] + \mathcal{L}^{-1}\left[\frac{1}{s-3}\right] = -e^{2t} + e^{3t}.$$
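
For readers who want the partial fractions step for $1/((s-2)(s-3))$ written out, a short
worked version is sketched here. The constants $A$ and $B$ are introduced purely for
illustration and do not appear in the original text.

```latex
\begin{align*}
\frac{1}{(s-2)(s-3)} &= \frac{A}{s-2} + \frac{B}{s-3}
  \quad\Longrightarrow\quad 1 = A(s-3) + B(s-2), \\
s = 2:&\quad 1 = -A \;\Longrightarrow\; A = -1, \\
s = 3:&\quad 1 = B \;\Longrightarrow\; B = 1, \\
\frac{1}{(s-2)(s-3)} &= \frac{1}{s-3} - \frac{1}{s-2}.
\end{align*}
```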


We pause here to note that the inversion of Laplace transforms using standard
forms is only possible because the operation is a bijection, that is, it is one-to-one
and onto. For every function $f(t)$ the Laplace transform $\mathcal{L}[f(t)]$ is uniquely defined
and vice versa. This is a direct consequence of Lerch’s theorem, Theorem 6.1.


Theorem 6.4 (First shifting theorem) If $\mathcal{L}[f(t)] = F(s)$ for $\mathrm{Re}(s) > b$, then
$\mathcal{L}[e^{at} f(t)] = F(s - a)$ for $\mathrm{Re}(s) > a + b$.


Proof By definition,

$$\mathcal{L}[e^{at} f(t)] = \int_0^\infty e^{-st} e^{at} f(t)\,dt = \int_0^\infty e^{-(s-a)t} f(t)\,dt = F(s - a),$$

provided that $\mathrm{Re}(s - a) > b$. □

Example 1

Consider the function $f(t) = e^{3t}\cos 4t$. We recall from the previous section that

$$\mathcal{L}[\cos 4t] = \frac{s}{s^2 + 4^2} \quad \text{for } \mathrm{Re}(s) > 0.$$

Using Theorem 6.4,

$$\mathcal{L}[e^{3t}\cos 4t] = \frac{s - 3}{(s - 3)^2 + 4^2} \quad \text{for } \mathrm{Re}(s) > 3.$$


Example 2 

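Consider the function $F(s) = s/(s^2 + s + 1)$. What is its inverse Laplace transform?
We begin by completing the square of the denominator, which gives

$$F(s) = \frac{s}{\left(s + \frac{1}{2}\right)^2 + \frac{3}{4}} = \frac{s + \frac{1}{2}}{\left(s + \frac{1}{2}\right)^2 + \frac{3}{4}} - \frac{\frac{1}{2}}{\left(s + \frac{1}{2}\right)^2 + \frac{3}{4}},$$

and hence

$$\mathcal{L}^{-1}\left[\frac{s}{s^2 + s + 1}\right] = \mathcal{L}^{-1}\left[\frac{s + \frac{1}{2}}{\left(s + \frac{1}{2}\right)^2 + \frac{3}{4}}\right] - \frac{1}{\sqrt{3}}\mathcal{L}^{-1}\left[\frac{\frac{\sqrt{3}}{2}}{\left(s + \frac{1}{2}\right)^2 + \frac{3}{4}}\right].$$

Using the first shifting theorem then gives us

$$\mathcal{L}^{-1}\left[\frac{s}{s^2 + s + 1}\right] = e^{-t/2}\cos\frac{\sqrt{3}\,t}{2} - \frac{1}{\sqrt{3}}e^{-t/2}\sin\frac{\sqrt{3}\,t}{2}.$$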
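
One way to gain confidence in an inversion like this is to transform the answer back
numerically and compare it with $F(s) = s/(s^2 + s + 1)$ at a sample point. The sketch
below is an illustration only, not part of the original text; it assumes the MATLAB
function integral, and the test value of s is an arbitrary choice.

```matlab
% Candidate inverse transform of s/(s^2 + s + 1) from the first shifting theorem.
f = @(t) exp(-t/2).*(cos(sqrt(3)*t/2) - sin(sqrt(3)*t/2)/sqrt(3));
s = 2;                                    % illustrative real test value
% Forward transform of f, evaluated numerically from definition (6.1).
F_num = integral(@(t) exp(-s*t).*f(t), 0, Inf);
F_exact = s/(s^2 + s + 1);
fprintf('numerical: %.8f   exact: %.8f\n', F_num, F_exact);
```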


Theorem 6.5 (Second shifting theorem) If the Laplace transform of $f(t)$ is
$F(s)$, then the Laplace transform of the function $g(t) = H(t-a)f(t-a)$ is $e^{-sa}F(s)$,
where $H$ is the Heaviside step function.


Proof By definition,

$$\mathcal{L}[g(t)] = \int_0^\infty g(t)e^{-st}\,dt = \int_a^\infty f(t-a)e^{-st}\,dt.$$

By writing $\tau = t - a$, this becomes

$$\mathcal{L}[g(t)] = \int_0^\infty f(\tau)e^{-s\tau}e^{-sa}\,d\tau = e^{-sa}F(s),$$

since the definition of the Laplace transform, (6.1), can be written in terms of any
dummy variable of integration. □


For example, to determine the inverse transform of the function $e^{-3s}/s^3$, we firstly
note that $\mathcal{L}[t^2] = 2/s^3$, using Example 3 of Section 6.1. The second shifting theorem
then shows immediately that

$$\mathcal{L}^{-1}\left[\frac{e^{-3s}}{s^3}\right] = \frac{1}{2}H(t-3)(t-3)^2.$$


6.3 The Solution of Ordinary Differential Equations Using Laplace
Transforms


In order to be able to take the Laplace transform of a differential equation, we will 
need to be able to calculate the Laplace transform of the derivative of a function. 
By definition, 

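$$\mathcal{L}[f'] = \int_0^\infty e^{-st} f'(t)\,dt.$$

After integrating by parts, we find that

$$\mathcal{L}[f'] = \left[e^{-st} f(t)\right]_0^\infty - \int_0^\infty -s\,e^{-st} f(t)\,dt.$$

At this stage, we will assume that the values of $s$ are restricted so that $e^{-st} f(t) \to 0$
as $t \to \infty$. This means that

$$\mathcal{L}[f'] = s\mathcal{L}[f] - f(0). \qquad (6.2)$$

A useful corollary of (6.2) is that, if

$$g(t) = \int_0^t f(\tau)\,d\tau,$$

so that, except where $f(t)$ is discontinuous, $g'(t) = f(t)$, we have $\mathcal{L}[f] = s\mathcal{L}[g] - g(0)$.
Since $g(0) = 0$ by definition,

$$\mathcal{L}[g] = \mathcal{L}\left[\int_0^t f(\tau)\,d\tau\right] = \frac{1}{s}F(s),$$

and hence

$$\mathcal{L}^{-1}\left[\frac{1}{s}F(s)\right] = \int_0^t f(\tau)\,d\tau.$$

This can be useful for inverting Laplace transforms, for example,

$$F(s) = \frac{1}{s(s^2 + \omega^2)}.$$

We know that $\mathcal{L}[\sin\omega t]/\omega = 1/(s^2 + \omega^2)$ so that

$$f(t) = \mathcal{L}^{-1}\left[\frac{1}{s}\,\frac{1}{s^2 + \omega^2}\right] = \mathcal{L}^{-1}\left[\frac{1}{s}\mathcal{L}\left[\frac{1}{\omega}\sin\omega t\right]\right] = \int_0^t \frac{1}{\omega}\sin\omega\tau\,d\tau = \frac{1}{\omega^2}(1 - \cos\omega t).$$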


Let’s now try to solve the simple differential equation

$$\frac{dy}{dt} - 2y = 0,$$

subject to the initial condition $y(0) = 1$. Of course, it is trivial to solve this
separable equation, but it is a useful illustrative example. We begin by taking the
Laplace transform of the differential equation, which gives

$$\mathcal{L}\left[\frac{dy}{dt}\right] - 2\mathcal{L}[y] = sY(s) - y(0) - 2Y(s) = 0,$$

where $Y(s) = \mathcal{L}[y(t)]$. Using the initial condition and manipulating the equation
gives

$$Y(s) = \frac{1}{s-2},$$

which is easily inverted to give $y(t) = e^{2t}$.

Many of the differential equations that we will try to solve are of second order,
so we need to determine $\mathcal{L}[f'']$. If we introduce the function $g(t) = f'(t)$, (6.2)
shows that

$$\mathcal{L}[g'] = sG(s) - g(0),$$

where $G(s) = \mathcal{L}[g] = \mathcal{L}[f'] = sF(s) - f(0)$. We conclude that

$$\mathcal{L}[f''] = \mathcal{L}[g'] = s^2F(s) - sf(0) - f'(0). \qquad (6.3)$$

We can obtain the same result by integrating the definition of the Laplace transform
of $f''$ twice by parts. It is also straightforward to show by induction that

$$\mathcal{L}[f^{(n)}] = s^nF(s) - s^{n-1}f(0) - s^{n-2}f'(0) - \cdots - f^{(n-1)}(0).$$

For example, we can now solve the differential equation

$$\frac{d^2y}{dt^2} - 5\frac{dy}{dt} + 6y = 0,$$

subject to the initial conditions $y(0) = 0$ and $y'(0) = 1$ using Laplace transforms.
We find that

$$s^2Y(s) - sy(0) - y'(0) - 5\left(sY(s) - y(0)\right) + 6Y(s) = 0.$$

Using the initial conditions then shows that

$$Y(s) = \frac{1}{s^2 - 5s + 6}.$$

In order to invert the Laplace transform we split the fraction into its constituent
partial fractions,

$$Y(s) = \frac{1}{s-3} - \frac{1}{s-2},$$

which immediately shows that

$$y(t) = e^{3t} - e^{2t}.$$
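
The solution $y(t) = e^{3t} - e^{2t}$ can be compared with a direct numerical integration of
the initial value problem. The following is a minimal sketch rather than anything taken
from the text: it assumes the standard MATLAB solver ode45, rewrites the equation
as a first order system with the hypothetical state vector z = [y; y'], and uses an
arbitrary time interval and tolerances.

```matlab
% y'' - 5y' + 6y = 0 with y(0) = 0, y'(0) = 1, written as z' = [y'; 5y' - 6y].
rhs = @(t, z) [z(2); 5*z(2) - 6*z(1)];
opts = odeset('RelTol', 1e-8, 'AbsTol', 1e-10);
[t, z] = ode45(rhs, [0 1], [0; 1], opts);
% Solution obtained from the Laplace transform method.
y_exact = exp(3*t) - exp(2*t);
max_err = max(abs(z(:, 1) - y_exact));
fprintf('maximum absolute difference on [0,1]: %.2e\n', max_err);
```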

Let’s now consider the solution of the coupled system of equations

$$\frac{dy_1}{dt} + y_1 = y_2, \quad \frac{dy_2}{dt} - y_2 = y_1,$$

subject to the initial conditions that $y_1(0) = y_2(0) = 1$. Although we could
combine these two first order equations to obtain a second order equation, we will
solve them directly using Laplace transforms. The transform of the equations is

$$sY_1 - 1 + Y_1 = Y_2, \quad sY_2 - 1 - Y_2 = Y_1,$$

where $\mathcal{L}[y_j(t)] = Y_j(s)$. The solution of these algebraic equations is

$$Y_1(s) = \frac{s}{s^2 - 2}, \quad Y_2(s) = \frac{s + 2}{s^2 - 2},$$

which we can easily invert by inspection to give

$$y_1(t) = \cosh\sqrt{2}\,t, \quad y_2(t) = \cosh\sqrt{2}\,t + \sqrt{2}\sinh\sqrt{2}\,t.$$

Note that the reduction of a system of differential equations to a system of algebraic
equations is the key benefit of these transform methods.
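
The algebra that takes the transformed system to $Y_1$ and $Y_2$ is short, and one possible
route is sketched here for completeness. The elimination order is a choice made for
this illustration and is not taken from the original text.

```latex
\begin{align*}
(s+1)Y_1 - Y_2 &= 1, \qquad -Y_1 + (s-1)Y_2 = 1, \\
Y_2 = (s+1)Y_1 - 1 \;&\Longrightarrow\; -Y_1 + (s-1)\{(s+1)Y_1 - 1\} = 1
   \;\Longrightarrow\; (s^2 - 2)Y_1 = s, \\
Y_1 &= \frac{s}{s^2-2}, \qquad
Y_2 = (s+1)Y_1 - 1 = \frac{s+2}{s^2-2}
    = \frac{s}{s^2-2} + \sqrt{2}\,\frac{\sqrt{2}}{s^2-2}.
\end{align*}
```

The final form of $Y_2$ makes the inversion by inspection explicit, since
$s/(s^2 - a^2)$ and $a/(s^2 - a^2)$ are the transforms of $\cosh at$ and $\sinh at$.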

Of course, for simple scalar equations, it is easier to use the standard solution 
techniques described in Appendix 5. The real power of the Laplace transform 
method lies in the solution of problems for which the inhomogeneity is not of a 
simple form, for example 

$$\frac{d^2y}{dt^2} + \frac{dy}{dt} = \delta(t - 1).$$

In order to take the Laplace transform of this equation, we need to know the Laplace
transform of the delta function. We can calculate this directly, as

$$\mathcal{L}[\delta(t - a)] = \int_0^\infty \delta(t - a)e^{-st}\,dt = e^{-sa},$$

so that the differential equation becomes

$$s^2Y(s) - sy(0) - y'(0) + sY(s) - y(0) = e^{-s},$$

and hence

$$Y(s) = \frac{(s + 1)y(0) + y'(0)}{s(s + 1)} + \frac{e^{-s}}{s(s + 1)}.$$


Judicious use of the two shifting theorems now allows us to invert this Laplace 
transform. Note that 

$$\mathcal{L}^{-1}\left[\frac{1}{s}\right] = 1,$$

and, using the first shifting theorem,

$$\mathcal{L}^{-1}\left[\frac{1}{s + 1}\right] = e^{-t}.$$

This means that

$$\mathcal{L}^{-1}\left[\frac{1}{s(s + 1)}\right] = \mathcal{L}^{-1}\left[\frac{1}{s} - \frac{1}{s + 1}\right] = 1 - e^{-t},$$

and, using the second shifting theorem,

$$\mathcal{L}^{-1}\left[\frac{e^{-s}}{s(s + 1)}\right] = H(t - 1)\left(1 - e^{-(t-1)}\right).$$

Combining all of these results, and using the linearity of the Laplace transform,
shows that

$$y(t) = y(0) + y'(0)\left(1 - e^{-t}\right) + H(t - 1)\left(1 - e^{-(t-1)}\right).$$

6.3.1 The Convolution Theorem 

The Laplace transform of the convolution of two functions plays an important 
role in the solution of inhomogeneous differential equations, just as it did for Fourier 
transforms. However, when using Laplace transforms, we define the convolution of 
two functions as 

$$f * g = \int_0^t f(\tau)g(t - \tau)\,d\tau.$$

With this definition, if $f(t)$ and $g(t)$ have Laplace transforms, then

$$\mathcal{L}[f * g] = \mathcal{L}[f(t)]\,\mathcal{L}[g(t)].$$

We can show this by taking the Laplace transform of the convolution integral, which
is itself a function of $t$, to obtain

$$\mathcal{L}[f * g] = \int_0^\infty e^{-st}\int_0^t f(\tau)g(t - \tau)\,d\tau\,dt.$$

Since $e^{-st}$ is independent of $\tau$ it can be moved inside the inner integral so that we
have

$$\mathcal{L}[f * g] = \int_0^\infty\int_0^t e^{-st} f(\tau)g(t - \tau)\,d\tau\,dt.$$

We now note that the domain of integration is a triangular region delimited by the
lines $\tau = 0$ and $\tau = t$, as shown in Figure 6.1. If we switch the order of integration,
this becomes

$$\mathcal{L}[f * g] = \int_{\tau=0}^{\infty}\int_{t=\tau}^{\infty} e^{-st} f(\tau)g(t - \tau)\,dt\,d\tau.$$

If we now introduce the variable $z = t - \tau$, so that $dz = dt$, we can transform the
inner integral into

$$\mathcal{L}[f * g] = \int_0^\infty f(\tau)\int_0^\infty e^{-s(z+\tau)}g(z)\,dz\,d\tau,$$

and hence

$$\mathcal{L}[f * g] = \int_{\tau=0}^{\infty} f(\tau)e^{-s\tau}\,d\tau\int_{z=0}^{\infty} e^{-sz}g(z)\,dz = \mathcal{L}[f]\,\mathcal{L}[g].$$

This result is most useful in the form

$$\mathcal{L}^{-1}[F(s)G(s)] = \mathcal{L}^{-1}[F(s)] * \mathcal{L}^{-1}[G(s)].$$

Fig. 6.1. The domain of integration of the Laplace transform of a convolution integral.


Example 1 

Consider the Laplace transform $F(s) = 1/((s - 2)(s - 3))$. This rational function
has two easily recognizable constituents, namely $1/(s - 2) = \mathcal{L}[e^{2t}]$ and $1/(s - 3) =
\mathcal{L}[e^{3t}]$, and hence

$$\mathcal{L}^{-1}\left[\frac{1}{(s - 2)(s - 3)}\right] = \mathcal{L}^{-1}\left[\frac{1}{s - 2}\right] * \mathcal{L}^{-1}\left[\frac{1}{s - 3}\right] = e^{2t} * e^{3t}.$$

We now need to calculate the convolution integral, which is

$$e^{2t} * e^{3t} = \int_{\tau=0}^{t} e^{2\tau}e^{3(t-\tau)}\,d\tau = e^{3t}\int_{\tau=0}^{t} e^{-\tau}\,d\tau = e^{3t}\left[-e^{-\tau}\right]_0^t = e^{3t}\left(-e^{-t} + 1\right) = -e^{2t} + e^{3t}.$$

This is, of course, the same result as we obtained using partial fractions.
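
The convolution integral $e^{2t} * e^{3t}$ is also easy to evaluate numerically, which gives a
quick check on the closed form. This sketch is illustrative only; it assumes the MATLAB
function integral, and the value of t is an arbitrary choice.

```matlab
% Numerical convolution (f*g)(t) = int_0^t f(tau) g(t - tau) dtau at a single time.
f = @(t) exp(2*t);
g = @(t) exp(3*t);
t = 0.8;                                  % illustrative evaluation time
conv_num = integral(@(tau) f(tau).*g(t - tau), 0, t);
conv_exact = exp(3*t) - exp(2*t);         % result from the convolution theorem
fprintf('numerical: %.8f   exact: %.8f\n', conv_num, conv_exact);
```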


Example 2 

Consider the Laplace transform

$$F(s) = \frac{s}{(s^2 + 1)(s - 2)}.$$

This is the product of the Laplace transforms of $\cos t$ and $e^{2t}$, and hence is the
Laplace transform of

$$\cos t * e^{2t} = \int_0^t e^{2\tau}\cos(t - \tau)\,d\tau = \frac{1}{2}\int_0^t e^{2\tau}\left(e^{i(t-\tau)} + e^{-i(t-\tau)}\right) d\tau$$

$$= \left[\frac{e^{i(t-\tau)+2\tau}}{2(2 - i)} + \frac{e^{-i(t-\tau)+2\tau}}{2(2 + i)}\right]_0^t = \frac{2}{5}e^{2t} - \frac{1}{5}(2\cos t - \sin t).$$

This is somewhat easier than the partial fractions method.


Example 3: Volterra integral equations

A Volterra integral equation can be written in the standard form

$$y(t) = f(t) + \int_0^t y(\tau)K(t - \tau)\,d\tau \quad \text{for } t > 0. \qquad (6.4)$$

The function $K$ is called the kernel of the equation. The integral is in the form
of a convolution and, if we treat $t$ as time, is an integral over the history of the
solution. The integral equation (6.4) can therefore be written as

$$y = f + y * K.$$

If we now take a Laplace transform of this equation, we obtain

$$Y = F + \mathcal{L}[y * K] = F + Y\mathcal{L}[K],$$

and hence

$$Y = \frac{F}{1 - \mathcal{L}[K]},$$

where $F(s) = \mathcal{L}[f]$ and $Y = \mathcal{L}[y]$. For example, to solve

$$y(t) = 1 + \int_0^t (t - \tau)y(\tau)\,d\tau,$$

for which $f(t) = 1$ and $K(t) = t$, we note that $F = 1/s$ and $\mathcal{L}[K] = 1/s^2$, and
hence

$$Y = \frac{s}{s^2 - 1}.$$

This gives us the solution, $y(t) = \cosh t$.
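
It is straightforward to confirm numerically that $y(t) = \cosh t$ satisfies the integral
equation $y(t) = 1 + \int_0^t (t - \tau)y(\tau)\,d\tau$. The sketch below is not from the original
text; it assumes the MATLAB function integral and checks the equation at a few
arbitrarily chosen times.

```matlab
% Residual of the Volterra equation y(t) = 1 + int_0^t (t - tau) y(tau) dtau
% for the candidate solution y(t) = cosh(t).
y = @(t) cosh(t);
times = [0.5 1.3 2.0];                    % illustrative sample times
residual = zeros(size(times));
for j = 1:numel(times)
    t = times(j);
    rhs = 1 + integral(@(tau) (t - tau).*y(tau), 0, t);
    residual(j) = y(t) - rhs;             % should be close to zero
end
disp(residual)
```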


6.4 The Inversion Formula for Laplace Transforms 

We have now seen that many Laplace transforms can be inverted from a knowledge 
of the transforms of a few common functions along with the linearity of the Laplace 
transform and the first and second shifting theorems. However, these techniques 
are often inadequate to invert Laplace transforms that arise as solutions of more 
complicated problems, in particular of partial differential equations. We will need 
an inversion formula. We can derive this in an informal way using the Fourier 
integral, (5.23). 

Let $g(t)$ be a function of exponential order, in particular with $\gamma$ the smallest
real number such that $e^{-\gamma t}g(t)$ is bounded as $t \to \infty$. As we have seen, $g(t)$
therefore has a Laplace transform $G(s)$, which exists for $\mathrm{Re}(s) > \gamma$. We now define
$h(t) = e^{-\gamma t}g(t)H(t)$. Since $h(t)$ is bounded as $t \to \infty$, the Fourier integral (5.23)
exists, and shows that

$$h(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{-ikt}\int_{-\infty}^{\infty} e^{ikT}h(T)\,dT\,dk,$$

and hence that

$$g(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{(\gamma - ik)t}\int_0^{\infty} e^{-(\gamma - ik)T}g(T)\,dT\,dk.$$

If we now make the change of variable $s = \gamma - ik$, so that $ds = -i\,dk$, we find that†

$$g(t) = \frac{1}{2\pi i}\int_{\gamma - i\infty}^{\gamma + i\infty} e^{st}\int_0^{\infty} e^{-sT}g(T)\,dT\,ds.$$

Finally, from the definition of the Laplace transform, we arrive at the inversion
formula, sometimes called the Bromwich inversion integral,

$$g(t) = \frac{1}{2\pi i}\int_{\gamma - i\infty}^{\gamma + i\infty} e^{st}G(s)\,ds. \qquad (6.5)$$

Note that the contour of integration is a vertical line in the complex $s$-plane. Since
$G(s)$ is only guaranteed to exist for $\mathrm{Re}(s) > \gamma$, this contour lies to the right of any
singularities of $G(s)$. It is often possible to simplify (6.5) by closing the contour
of integration using a large semicircle in the left half plane. As we shall see in the
following examples, if the contour can be closed in this way, the result will depend
crucially on the residues at any poles of $e^{st}G(s)$.


Example 1 

We start with the simple case $G(s) = \beta/(s - \alpha)$. Consider the integral

$$I(t) = \frac{1}{2\pi i}\int_C e^{st}G(s)\,ds, \qquad (6.6)$$

where the closed contour $C$ is shown in Figure 6.2. By the residue theorem, $I(t)$ is
equal to the sum of the residues at any poles of $e^{st}G(s)$ enclosed by $C$. Let’s assume
that the contour encloses the simple pole of $G(s)$ at $s = \alpha$, and hence that the
straight boundary of $C$ lies to the right of the pole at $s = \alpha$. The residue theorem
then shows that $I(t) = \beta e^{\alpha t}$. As $b \to \infty$, the semicircular part of the contour
becomes large and, since $|G(s)|$ is algebraically small when $|s| \gg 1$, we conclude
from Jordan’s lemma (see Section A6.4) that the integral along the semicircle tends
to zero. On the straight part of the contour $C$, as $b \to \infty$ we recover the inversion
integral (6.5), so that $I(t) = g(t)$, and hence the inverse Laplace transform is
$g(t) = \beta e^{\alpha t}$, as we would expect.


† This is the point at which the derivation is informal, since we have not shown that this change
in the contour of integration is possible.

Fig. 6.2. The contour $C$ used to evaluate the inverse Laplace transform of $\beta/(s - \alpha)$.


Example 2 

Let’s try to determine the inverse Laplace transform of $G(s) = 1/(s^2 + 1)$. In this
case $G(s)$ has simple poles at $s = \pm i$, since

$$G(s) = \frac{1}{2i}\left(\frac{1}{s - i} - \frac{1}{s + i}\right).$$

Choosing the contour $C$ as we did in Example 1, we again find that $g(t)$ is the sum
of the residues at the two simple poles of $e^{st}G(s)$, and hence that, as expected,

$$g(t) = \frac{1}{2i}\left(e^{it} - e^{-it}\right) = \sin t.$$


Example 3 

We now consider a Laplace transform that has a nonsimple pole, $G(s) = 1/(s - \alpha)^3$.
The simplest way to calculate the residue of $e^{st}G(s)$ at $s = \alpha$ is to note that

$$\frac{e^{st}}{(s - \alpha)^3} = \frac{e^{\alpha t}e^{(s-\alpha)t}}{(s - \alpha)^3} = \frac{e^{\alpha t}}{(s - \alpha)^3}\left\{1 + (s - \alpha)t + \frac{1}{2}(s - \alpha)^2t^2 + \cdots\right\},$$

and hence that $g(t) = \frac{1}{2}t^2e^{\alpha t}$. We can check this result by noting that $\mathcal{L}[t^2] =
\Gamma(3)/s^3 = 2/s^3$ and using the first shifting theorem.


Example 4

Consider the inverse Laplace transform of $G(s) = e^{-\alpha s^{1/2}}/s$. Since $G(s)$ contains
a fractional power, $s^{1/2}$, the point $s = 0$ is a branch point, and the definition of
$G(s)$ is incomplete. We need to introduce a branch cut in order to make $G(s)$
single-valued. It is convenient to place the branch cut along the negative real axis,
so that, if $s = |s|e^{i\theta}$ with $-\pi < \theta < \pi$, $s^{1/2} = +\sqrt{|s|}\,e^{i\theta/2}$ and the real part of $s^{1/2}$
is positive. We cannot now integrate $e^{st}G(s)$ along any contour that crosses the
negative real axis, such as $C$. Because of this, we use $C_B$, which avoids the branch
cut, as shown in Figure 6.3. This contour also includes a small circle around the
origin of radius $\epsilon$, since $G(s)$ has a simple pole there, and is often referred to as a
keyhole contour.

Fig. 6.3. The keyhole inversion contour, $C_B$, used for inverting Laplace transforms with
a branch cut along the negative real axis.

Since $e^{st}G(s)$ is analytic within $C_B$, the integral around $C_B$ is zero, by Cauchy’s
theorem (see Appendix 6). On the circular arcs AB and EF, $G(s) \to 0$ exponentially
fast as $b \to \infty$, since $s^{1/2}$ has positive real part, and hence the contributions to
the integral from these arcs tend to zero. As before, the integral along the straight
contour AF tends to $g(t)$ as $b \to \infty$, by the inversion formula (6.5). We conclude
that

$$g(t) = -\lim_{\epsilon \to 0,\ b \to \infty}\frac{1}{2\pi i}\left\{\int_{BC} + \int_{CD} + \int_{DE}\right\}\frac{e^{st - \alpha s^{1/2}}}{s}\,ds. \qquad (6.7)$$


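Let’s consider the contributions from the lines BC and DE. We can parameterize
these lines as $s = xe^{\pm i\pi}$ respectively. In this way, we ensure that we use the correct
value of $s^{1/2}$ on either side of the branch cut. Along BC, $s^{1/2} = x^{1/2}e^{i\pi/2} = ix^{1/2}$,
and hence, in the limit as $\epsilon \to 0$ and $b \to \infty$

$$\int_{BC}\frac{e^{st - \alpha s^{1/2}}}{s}\,ds = \int_\infty^0\frac{e^{-xt - \alpha ix^{1/2}}}{x}\,dx.$$

Similarly

$$\int_{DE}\frac{e^{st - \alpha s^{1/2}}}{s}\,ds = \int_0^\infty\frac{e^{-xt + \alpha ix^{1/2}}}{x}\,dx,$$

and hence

$$\left\{\int_{BC} + \int_{DE}\right\}\frac{e^{st - \alpha s^{1/2}}}{s}\,ds = 2i\int_0^\infty\frac{e^{-xt}\sin\alpha x^{1/2}}{x}\,dx.$$

In order to calculate this integral, we note that

$$I = \int_0^\infty\frac{e^{-xt}\sin\alpha x^{1/2}}{x}\,dx = \int_0^\infty\frac{e^{-xt}}{x}\sum_{n=1}^{\infty}(-1)^{n-1}\frac{(\alpha x^{1/2})^{2n-1}}{(2n-1)!}\,dx, \qquad (6.8)$$

using the Taylor series expansion of $\sin\alpha x^{1/2}$. Since this series is uniformly
convergent for all $x$, we can interchange the order of summation and integration, so
that

$$I = \sum_{n=1}^{\infty}(-1)^{n-1}\frac{\alpha^{2n-1}}{(2n-1)!}\int_0^\infty e^{-xt}x^{n-3/2}\,dx = \sum_{n=1}^{\infty}(-1)^{n-1}\frac{\alpha^{2n-1}}{(2n-1)!}\frac{1}{t^{n-1/2}}\int_0^\infty e^{-X}X^{n-3/2}\,dX$$

$$= \sum_{n=1}^{\infty}(-1)^{n-1}\frac{\alpha^{2n-1}}{t^{n-1/2}}\frac{\Gamma\left(n - \frac{1}{2}\right)}{(2n-1)!},$$

using the change of variable $X = xt$. Now, since

$$\Gamma\left(n - \tfrac{1}{2}\right) = \left(n - \tfrac{3}{2}\right)\left(n - \tfrac{5}{2}\right)\cdots\tfrac{3}{2}\cdot\tfrac{1}{2}\,\Gamma\left(\tfrac{1}{2}\right) = \frac{1}{2^{n-1}}(2n-3)(2n-5)\cdots 3\cdot 1\cdot\sqrt{\pi},$$

we find that

$$\frac{\Gamma\left(n - \frac{1}{2}\right)}{(2n-1)!} = \frac{\sqrt{\pi}}{2^{n-1}}\frac{(2n-3)(2n-5)\cdots 3\cdot 1}{(2n-1)(2n-2)\cdots 3\cdot 2\cdot 1} = \frac{\sqrt{\pi}}{2^{n-1}}\frac{1}{(2n-1)(2n-2)(2n-4)\cdots 4\cdot 2} = \frac{\sqrt{\pi}}{2^{2(n-1)}(2n-1)(n-1)!},$$

and hence

$$I = 2\sqrt{\pi}\sum_{n=1}^{\infty}(-1)^{n-1}\left(\frac{\alpha}{2t^{1/2}}\right)^{2n-1}\frac{1}{(2n-1)(n-1)!} = 2\sqrt{\pi}\int_0^{\alpha/2t^{1/2}}\sum_{n=1}^{\infty}\frac{(-s^2)^{n-1}}{(n-1)!}\,ds$$

$$= 2\sqrt{\pi}\int_0^{\alpha/2t^{1/2}} e^{-s^2}\,ds = \pi\,\mathrm{erf}\left(\frac{\alpha}{2t^{1/2}}\right),$$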
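
where $\mathrm{erf}(x)$ is the error function, defined by

$$\mathrm{erf}(x) = \frac{2}{\sqrt{\pi}}\int_0^x e^{-q^2}\,dq.$$

We can parameterize the small circle CD using $s = \epsilon e^{i\theta}$, where $\theta$ runs from $\pi$ to
$-\pi$. On this curve we have $s^{1/2} = \epsilon^{1/2}e^{i\theta/2}$, so that

$$\int_{CD}\frac{e^{st - \alpha s^{1/2}}}{s}\,ds = \int_\pi^{-\pi}\frac{e^{\epsilon te^{i\theta} - \alpha\epsilon^{1/2}e^{i\theta/2}}}{\epsilon e^{i\theta}}\,i\epsilon e^{i\theta}\,d\theta = \int_\pi^{-\pi} e^{\epsilon te^{i\theta} - \alpha\epsilon^{1/2}e^{i\theta/2}}\,i\,d\theta.$$

As $\epsilon \to 0$ we therefore have

$$\int_{CD}\frac{e^{st - \alpha s^{1/2}}}{s}\,ds = \int_\pi^{-\pi} i\,d\theta = -2\pi i. \qquad (6.9)$$

If we now use (6.8) and (6.9) in (6.7), we conclude that

$$\mathcal{L}^{-1}\left[\frac{e^{-\alpha s^{1/2}}}{s}\right] = 1 - \mathrm{erf}\left(\frac{\alpha}{2t^{1/2}}\right) = \mathrm{erfc}\left(\frac{\alpha}{2t^{1/2}}\right),$$

where $\mathrm{erfc}(x)$ is the complementary error function, defined by

$$\mathrm{erfc}(x) = \frac{2}{\sqrt{\pi}}\int_x^\infty e^{-q^2}\,dq.$$

Laplace transforms of the error function and complementary error function arise
very frequently when solving diffusion problems, as we shall see in the following
example.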
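
The transform pair relating $e^{-\alpha s^{1/2}}/s$ and $\mathrm{erfc}(\alpha/2t^{1/2})$ can be checked in the forward
direction by numerical integration. This is a sketch under stated assumptions rather
than part of the original text: it uses the MATLAB functions integral and erfc, the
variable a stands for the constant in the exponent, and the values of a and s are
arbitrary positive numbers.

```matlab
% Forward check: the Laplace transform of erfc(a/(2*sqrt(t))) should equal
% exp(-a*sqrt(s))/s for real s > 0 and a > 0.
a = 1.2;                                  % illustrative constant (assumption)
s = 0.7;                                  % illustrative real value of s
F_num = integral(@(t) exp(-s*t).*erfc(a./(2*sqrt(t))), 0, Inf);
F_exact = exp(-a*sqrt(s))/s;
fprintf('numerical: %.8f   exact: %.8f\n', F_num, F_exact);
```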


Example 5: Flow due to an impulsively-started flat plate 

Let’s consider the two-dimensional flow of a semi-infinite expanse of viscous fluid
caused by the sudden motion of a flat plate in its own plane. We will use Cartesian
coordinates with the $x$-axis lying in the plane of the plate and the $y$-axis pointing
into the semi-infinite body of fluid. This is a uni-directional flow with velocity
$u(x, y, t)$ in the $x$-direction only and associated scalar pressure field $p(x, y, t)$. The
continuity equation, $u_x + v_y = 0$, which we derived in Chapter 2, shows that
the streamwise velocity, $u$, is solely a function of $y$ and $t$. We now consider the
streamwise momentum within a small element of fluid, as shown in Figure 6.4.
Balancing forces in the $x$-direction, and taking the limit $\delta x, \delta y, \delta t \to 0$, we find
that

$$\rho\frac{Du}{Dt} = \rho\left(\frac{\partial u}{\partial t} + u\frac{\partial u}{\partial x}\right) = -\frac{\partial p}{\partial x} + \frac{\partial\tau}{\partial y},$$

where $\rho$ is the density and $\tau$ is the shear stress. Here we have used the convective
derivative, which we derived in Chapter 2. For a one-dimensional flow, it is found
experimentally that $\tau = \mu\,\partial u/\partial y$, where $\mu$ is the dynamic viscosity (see Acheson,
1990). To be consistent with $u = u(y, t)$, we now insist that $\partial/\partial x = 0$. This reduces
the $x$-momentum equation to $u_t = \nu u_{yy}$, where we have introduced the quantity
$\nu = \mu/\rho$, the kinematic viscosity. Flows with high values of $\nu$ are extremely
viscous, for example tar, lava and mucus, whilst those with low viscosity include
air, water and inert gases. For example, the kinematic viscosity of lava is around
$10\ \mathrm{m^2\,s^{-1}}$, whilst the values for water and air at room temperature and pressure,
$10^{-6}\ \mathrm{m^2\,s^{-1}}$ and $1.5 \times 10^{-5}\ \mathrm{m^2\,s^{-1}}$, respectively, are many orders of magnitude
smaller and differ from each other only by a factor of about 15.

Fig. 6.4. The $x$-momentum balance on a small element of fluid.

Our final initial-boundary value problem is

$$\frac{\partial u}{\partial t} = \nu\frac{\partial^2 u}{\partial y^2}, \qquad (6.10)$$

to be solved subject to

$$u = 0 \quad \text{when } t = 0 \text{ for } y > 0, \qquad (6.11)$$

$$u = U \quad \text{at } y = 0 \text{ for } t > 0, \qquad (6.12)$$

$$u \to 0 \quad \text{as } y \to \infty \text{ for } t > 0. \qquad (6.13)$$

This is known as Rayleigh’s problem. Equation (6.10) states that the only
process involved in this flow is the diffusion of $x$-momentum into the bulk of the fluid.

Of course, this initial-boundary value problem can also be thought of as modelling
other diffusive systems, for example, a semi-infinite bar of metal, insulated along
its sides, suddenly heated up at one end.

In order to solve this initial-boundary value problem, we will take a Laplace 
transform with respect to time, so that 

$$\mathcal{L}[u(y, t)] = U(y, s) = \int_0^\infty e^{-st}u(y, t)\,dt.$$

Taking two derivatives of this definition with respect to $y$ shows that

$$\mathcal{L}\left[\frac{\partial^2 u}{\partial y^2}\right] = \frac{\partial^2 U}{\partial y^2}.$$

In general, $y$-derivatives are not affected by a Laplace transform with respect to $t$.
After using (6.2) to determine the Laplace transform of $\partial u/\partial t$, (6.10) becomes

$$sU = \nu\frac{\partial^2 U}{\partial y^2}. \qquad (6.14)$$

The transform variable only appears as a parameter in this equation, which
therefore has the solution

$$U(y, s) = A(s)e^{s^{1/2}y/\nu^{1/2}} + B(s)e^{-s^{1/2}y/\nu^{1/2}}.$$

Since $u \to 0$ as $y \to \infty$, we must also have $U \to 0$ as $y \to \infty$, so that $A(s) = 0$. We
now need to transform the boundary condition (6.12), which gives

$$U(0, s) = \int_0^\infty u(0, t)e^{-st}\,dt = \int_0^\infty Ue^{-st}\,dt = \frac{U}{s},$$

and hence $B(s) = U/s$. We conclude that

$$U(y, s) = \frac{U}{s}e^{-s^{1/2}y/\nu^{1/2}}.$$

From the result of the previous example, we find that

$$u(y, t) = U\,\mathrm{erfc}\left(\frac{y}{2\sqrt{\nu t}}\right).$$

Some typical velocity profiles are shown in Figure 6.5. These are easy to plot, 
since the error and complementary error functions are available as erf and erfc in 
MATLAB. 
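
A few lines of MATLAB are enough to draw profiles of $u = U\,\mathrm{erfc}(y/2\sqrt{\nu t})$ of this kind.
The script below is a minimal sketch rather than the code used to produce the figure
in the original text: the sample times, the range of y and the choice $U = \nu = 1$ are
all assumptions made for illustration.

```matlab
% Velocity profiles u(y,t) = U*erfc(y/(2*sqrt(nu*t))) for Rayleigh's problem.
U = 1;  nu = 1;                           % plate speed and kinematic viscosity
y = linspace(0, 5, 201);                  % distances from the plate
times = [0.1 0.5 1 2];                    % sample times (assumed values)
u = zeros(numel(times), numel(y));
for j = 1:numel(times)
    u(j, :) = U*erfc(y./(2*sqrt(nu*times(j))));
end
plot(u.', y)                              % one curve per time, u horizontal
xlabel('u'), ylabel('y')
legend('t = 0.1', 't = 0.5', 't = 1', 't = 2')
```

Each curve starts at $u = U$ on the plate and decays with $y$, and the region of moving
fluid thickens like $\sqrt{\nu t}$ as time increases.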

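We can also consider what happens when the velocity of the plate is a function
of time. All we need to change is the boundary condition (6.12), which becomes

$$u = Uf(t) \quad \text{at } y = 0 \text{ for } t > 0. \qquad (6.15)$$

The Laplace transform of this condition is $U(0, s) = UF(s)$, where $F(s)$ is the
Laplace transform of $f(t)$. Now $B(s) = UF(s)$, and hence

$$U(y, s) = UF(s)e^{-s^{1/2}y/\nu^{1/2}} = UsF(s)\frac{e^{-s^{1/2}y/\nu^{1/2}}}{s}. \qquad (6.16)$$

Using (6.2), we can see that $sF(s) = \mathcal{L}[f'(t)] + f(0)$, and hence

$$U(y, s) = U\left\{\mathcal{L}[f'(t)] + f(0)\right\}\mathcal{L}\left[\mathrm{erfc}\left(\frac{y}{2\sqrt{\nu t}}\right)\right]. \qquad (6.17)$$

We can now invert this Laplace transform using the convolution theorem for the
term involving $\mathcal{L}[f'(t)]$, and linearity for the term multiplied by the constant $f(0)$,
to give

$$u(y, t) = U\left\{f(0)\,\mathrm{erfc}\left(\frac{y}{2\sqrt{\nu t}}\right) + \int_{\tau=0}^{t} f'(t - \tau)\,\mathrm{erfc}\left(\frac{y}{2\sqrt{\nu\tau}}\right) d\tau\right\}. \qquad (6.18)$$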

Exact solutions of the equations that govern the flow of a viscous fluid, the
Navier-Stokes equations, are very useful for verifying that numerical solution methods
are working correctly. Their stability can also be studied more easily than for flows
where no analytical solution is available.

By evaluating (6.18) numerically, we can calculate the flow profiles for any $f(t)$.
A MATLAB function that evaluates $u(y, t)$, for any specified $f(t)$, is


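function rayleigh(yout,tout)
% Plot u(y,t) from (6.18), with U = 1, at the heights yout and times tout.
nu = 1; f0 = 0;    % kinematic viscosity and initial plate velocity f(0)
for t = tout
    u = [];
    for y = yout
        % convolution integral in (6.18), to a tolerance of 10^-4
        I = quadl(@integrand,0,t,1e-4,0,t,y,nu);
        u = [u f0*erfc(y/2/(eps+sqrt(nu*t)))+I];
    end
    plot(u,yout), xlabel('u'), ylabel('y')
    title(['t = ' num2str(t)]), xlim([0 1])
    pause(0.5)
end

function val = integrand(tau,t,y,nu)
% f'(t-tau)*erfc(y/(2*sqrt(nu*tau))); eps avoids division by zero at tau = 0
val = df(t-tau).*erfc(y/2./(eps+sqrt(nu*tau)));

function val = df(t)
% derivative f'(t) of the plate velocity f(t) = sin 2t
val = 2*cos(2*t);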


This uses the MATLAB function quadl to evaluate the integral correct to four
decimal places, and then plots the solution at the points given in yout at the
times given in tout. In the example shown, $f(t) = \sin 2t$, which corresponds to
an oscillating plate, started from rest. Note that we add the inbuilt small quantity
eps to the denominator of the argument of the error function to avoid MATLAB
producing lots of irritating division by zero warnings. In fact, MATLAB is able to
evaluate erfc(y/0) = erfc(Inf) = 0 for y positive.

As a final example, let’s consider what we would do if we did not have access
to a computer, but wanted to know what happens when the plate oscillates, with
$f(t) = \sin\omega t$. Since $F(s) = \omega/(s^2 + \omega^2)$, (6.16), along with the inversion formula
(6.5), gives

$$u(y, t) = \frac{1}{2\pi i}\int_{\gamma - i\infty}^{\gamma + i\infty}\frac{U\omega\,e^{st - y\sqrt{s/\nu}}}{s^2 + \omega^2}\,ds, \qquad (6.19)$$


where $\gamma > 0$, since the integrand has poles at $s = \pm i\omega$. Although (6.18) is the
most convenient form to use for numerical integration, (6.19) gives us the most
helpful way of approaching this specific problem. We proceed as we did in Example
4, evaluating the integral on the contour $C_B$ shown in Figure 6.3. The analysis
proceeds exactly as it did in Example 4, except that now the integral around $C_B$
is equal to the sum of the residues of the integrand at the poles at $s = \pm i\omega$. In
addition, the integral around the small circle CD tends to zero as $\epsilon \to 0$. After
taking all of the contributions into account (see Exercise 6.8), we find that

$$u(y, t) = \frac{U\omega}{\pi}\int_0^\infty\frac{e^{-\sigma t}\sin\left(\sqrt{\sigma/\nu}\,y\right)}{\sigma^2 + \omega^2}\,d\sigma + U\exp\left(-\sqrt{\frac{\omega}{2\nu}}\,y\right)\sin\left(\omega t - \sqrt{\frac{\omega}{2\nu}}\,y\right). \qquad (6.20)$$

The first term, which comes from the branch cut integrals, tends to zero as $t \to \infty$,
and represents the initial transient that arises because the plate starts at rest. The
second term, which comes from the residues at the poles, represents the oscillatory
motion of the fluid, whose phase changes with $y$ and whose amplitude decays
exponentially fast with $y$. This second term therefore gives the large time behaviour
of the fluid. You can see what this solution looks like by running the MATLAB
function that we gave above.

Fig. 6.5. The flow due to an impulsively-started flat plate when $U = 1$ and $\nu = 1$.
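
The decomposition of the oscillating-plate solution into a transient branch cut integral
and a periodic residue term can be compared numerically with the convolution form of
the solution, which for $f(t) = \sin\omega t$ (so that $f(0) = 0$) is
$U\int_0^t \omega\cos\omega(t - \tau)\,\mathrm{erfc}(y/2\sqrt{\nu\tau})\,d\tau$. The sketch below is an illustration, not part of
the original text: it assumes the MATLAB function integral, and the parameter values
and the variable names are arbitrary choices.

```matlab
% Compare the contour-integral form of the oscillating-plate solution with
% the convolution form at a single point (y, t).
U = 1; nu = 1; omega = 2;                 % illustrative parameters
y = 0.7; t = 1.5;                         % illustrative evaluation point
k = sqrt(omega/(2*nu));                   % decay rate and wavenumber in y
% Transient part: branch cut integral over sigma in [0, Inf).
transient = (U*omega/pi)*integral( ...
    @(sig) exp(-sig*t).*sin(sqrt(sig/nu)*y)./(sig.^2 + omega^2), 0, Inf);
% Periodic part: contribution from the residues at s = +/- i*omega.
periodic = U*exp(-k*y)*sin(omega*t - k*y);
u_contour = transient + periodic;
% Convolution form with f(t) = sin(omega*t), for which f(0) = 0.
u_conv = U*integral( ...
    @(tau) omega*cos(omega*(t - tau)).*erfc(y./(2*sqrt(nu*tau))), 0, t);
fprintf('contour form: %.6f   convolution form: %.6f\n', u_contour, u_conv);
```

Repeating the comparison at larger t shows the transient term shrinking, so that the
periodic term alone approximates the flow.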

Exercises

6.1 Find the Laplace transforms of the functions, (a) $t(t+1)(t+2)$, (b) $\sinh(\omega t)$,
(c) $\cosh(\omega t)$, (d) $e^{-t}\sin t$. Specify the values of $s$ for which each transform
exists.

6.2 Find the inverse Laplace transforms of the functions, (a) $1/(s^2 - 3s + 2)$,
(b) $1/(s^3(s+1))$, (c) $1/(2s^2 - s + 1)$, (d) $1/(s^2(s+1)^2)$, (e) $(2s+3)/(s^2 - 4s + 20)$,
(f) $1/(s^4 + 9s^2)$, (g) $(2s - 4)/(s - 1)^4$, (h) $(s^2 + 1)/(s^3 - s^2 + 2s - 2)$,
(i) $s^3/((s+3)^2(s+2)^2)$.

6.3 Using Laplace transforms, solve the initial value problems

(a) = 18i > subject to y(0) = 0, y (|) =0,

(b) - 4^| + 2>y = f(t), subject to y( 0) = 1 and y'(0) = 0,

(c) Tt +x + 2y = t ' d d^ + 5x + 3 tt = °’
subject to 2/(0) = ®(0) = 2/(0) = 0,

( d ) (~T^ ~ 4 4 + V = /(*)> subject to 2/(0) = -2, y'( 0) = 1, with


dt 2


dt


m =


t if 0 < t < 3,
t + 2 if t ^ 3.


. . dx dy „ dx dy , . . .

(e) — + 2; + — = 0, -77 ~ x + 2 -j- = e *, subject to 2i(0) = 2/(0) = 1.

dt dt dt dt

(f) $\dfrac{d^2x}{dt^2} = -2x + y$, $\dfrac{d^2y}{dt^2} = x - 2y$, subject to the initial conditions
$x(0) = y(0) = 1$ and $x'(0) = y'(0) = 0$.

6.4 Using Laplace transforms, solve the integral and integro-differential
equations


1 /'*

(a) y(t) = t+-j y(r)(t — r) 3 dr,

(b) $\displaystyle\int_0^t y(\tau)\cos(t - \tau)\,d\tau = \frac{dy}{dt}$, subject to $y(0) = 1$.

6.5 Show that $\mathcal{L}[tf(t)] = -dF/ds$, where $F(s) = \mathcal{L}[f(t)]$. Hence solve the
initial value problem

$$\frac{d^2x}{dt^2} + 2t\frac{dx}{dt} - 4x = 1, \quad \text{subject to } x(0) = x'(0) = 0.$$

6.6 Determine the inverse Laplace transform of

(s 2 + l)(s — 2)

using (a) the inversion formula (6.5), and (b) the residue theorem.
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6.7 Find the inverse Laplace transform of 

F(S) = (s-l)(s 2 + l)’ 

by (a) expressing F(s) as partial fractions and inverting the constituent 
parts and (b) using the convolution theorem. 

6.8 Evaluate the integral (6.19), and hence verify that (6.20) holds. 

6.9 Consider the solution of the diffusion equation

$$\frac{\partial\phi}{\partial t} = D\frac{\partial^2\phi}{\partial y^2}$$

for $t > 0$ and $y \in [0, L]$, subject to the initial condition $\phi(y, 0) = 0$ and
the boundary conditions $\phi(0, t) = \phi(L, t) = 1$. Show that the Laplace
transform of the solution is

where $k = \sqrt{s/D}$. By expanding the denominator, show that the solution
can be written as

»"=|-''h(sl)-*(ir)] 

6.10 (a) Show that a small displacement, $y(x, t)$, of a uniform string with
constant tension $T$ and line density $\rho$ subject to a uniform gravitational
acceleration, $g$, downwards satisfies

$$\frac{\partial^2 y}{\partial t^2} = c^2\frac{\partial^2 y}{\partial x^2} - g,$$

where $c = \sqrt{T/\rho}$ (see Section 3.9.1).

(b) Such a string is semi-infinite in extent, and has $y = y_t = 0$ for $x \ge 0$
when $t = 0$. Use Laplace transforms to determine the solution when
the string is fixed at $x = 0$, satisfies $y_x \to 0$ as $x \to \infty$ and is allowed
to fall under gravity. Sketch the solution, and explain how and why
the qualitative form of the solution depends upon the sign of $x - ct$.

6.11 Project The voltage, $v(t)$, in an RLC circuit with implied current $i(t)$ is
given by the solution of the differential equation

$$C\frac{d^2v}{dt^2} + \frac{1}{R}\frac{dv}{dt} + \frac{v}{L} = \frac{di}{dt},$$

where $C$ is the capacitance, $R$ the resistance and $L$ the inductance.

(a) Solve this equation using Laplace transforms when $i(t) = H(t - 1) - H(t)$.

(b) Write a MATLAB script that inverts the Laplace transform of this
equation directly, so that it works when $i(t)$ is supplied by an input
routine input.m.

(c) Compare your results for various values of the parameters $R$, $L$ and $C$.

(d) Extend this further to consider what happens when the implied
current is periodic.



CHAPTER SEVEN 


Classification, Properties and Complex Variable 
Methods for Second Order Partial Differential 
Equations 


In this chapter, we will consider a class of partial differential equations, of which 
Laplace’s equation, the diffusion equation and the wave equation are the canonical 
examples. We will discuss what sort of boundary conditions are appropriate, and 
why solutions of these equations have their distinctive properties. We will also 
demonstrate that complex variable methods are powerful tools for solving boundary
value problems for Laplace’s equation, with particular application to certain
problems in fluid mechanics. 


7.1 Classification and Properties of Linear, Second Order Partial 
Differential Equations in Two Independent Variables 

Consider a second order linear partial differential equation in two independent 
variables, which we can write as 

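$$a(x, y)\frac{\partial^2\phi}{\partial x^2} + 2b(x, y)\frac{\partial^2\phi}{\partial x\,\partial y} + c(x, y)\frac{\partial^2\phi}{\partial y^2} + d_1(x, y)\frac{\partial\phi}{\partial x} + d_2(x, y)\frac{\partial\phi}{\partial y} + d_3(x, y)\phi = f(x, y). \tag{7.1}$$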

Equations of this type arise frequently in mathematical modelling, as we have already
seen. We will show that the first three terms of (7.1) allow us to classify the
equation into one of three distinct types: elliptic, for example Laplace’s equation, 
(2.13), parabolic, for example the diffusion equation, (2.12), or hyperbolic, for 
example the wave equation, (3.39). Each of these types of equation has distinctive 
properties. These mathematical properties are related to the physical properties of 
the system that the equation models. 


7.1.1 Classification 

We would like to know about properties of (7.1) that are unchanged by an
invertible change of coordinates, since these must be of fundamental significance,
and not just a result of our choice of coordinate system. We can write this change
of coordinates as†

$$(x, y) \mapsto (\xi(x, y), \eta(x, y)), \quad \text{with} \quad \frac{\partial(\xi, \eta)}{\partial(x, y)} \neq 0. \tag{7.2}$$

In particular, if (7.1) is a model for a physical system, a change of coordinates
should not affect its qualitative behaviour. Writing $\phi(x, y) \equiv \psi(\xi, \eta)$ and using
subscripts to denote partial derivatives, we find that

$$\phi_x = \xi_x\psi_\xi + \eta_x\psi_\eta, \quad \phi_{xx} = \xi_x^2\psi_{\xi\xi} + 2\xi_x\eta_x\psi_{\xi\eta} + \eta_x^2\psi_{\eta\eta} + \xi_{xx}\psi_\xi + \eta_{xx}\psi_\eta,$$

and similarly for the other derivatives. Substituting these into (7.1) gives us

$$A\psi_{\xi\xi} + 2B\psi_{\xi\eta} + C\psi_{\eta\eta} + b_1(\xi, \eta)\psi_\xi + b_2(\xi, \eta)\psi_\eta + b_3(\xi, \eta)\psi = g(\xi, \eta), \tag{7.3}$$

where 

$$\begin{aligned}
A &= a\xi_x^2 + 2b\xi_x\xi_y + c\xi_y^2, \\
B &= a\xi_x\eta_x + b\left(\eta_x\xi_y + \xi_x\eta_y\right) + c\xi_y\eta_y, \\
C &= a\eta_x^2 + 2b\eta_x\eta_y + c\eta_y^2.
\end{aligned} \tag{7.4}$$


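We do not need to consider the other coefficient functions here. We can express
(7.4) in a concise matrix form as

$$\begin{pmatrix} A & B \\ B & C \end{pmatrix} = \begin{pmatrix} \xi_x & \xi_y \\ \eta_x & \eta_y \end{pmatrix} \begin{pmatrix} a & b \\ b & c \end{pmatrix} \begin{pmatrix} \xi_x & \eta_x \\ \xi_y & \eta_y \end{pmatrix}, \tag{7.5}$$

which shows that

$$\det\begin{pmatrix} A & B \\ B & C \end{pmatrix} = \det\begin{pmatrix} a & b \\ b & c \end{pmatrix} \left(\frac{\partial(\xi, \eta)}{\partial(x, y)}\right)^2. \tag{7.6}$$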


This shows that the sign of $ac - b^2$ is independent of the choice of coordinate system,
which allows us to classify the equation.

An elliptic equation has $ac > b^2$, for example, Laplace’s equation

$$\frac{\partial^2\phi}{\partial x^2} + \frac{\partial^2\phi}{\partial y^2} = 0.$$

A parabolic equation has $ac = b^2$, for example, the diffusion equation

$$K\frac{\partial^2\phi}{\partial x^2} - \frac{\partial\phi}{\partial y} = 0.$$


A hyperbolic equation has $ac < b^2$, for example, the wave equation

$$\frac{1}{c^2}\frac{\partial^2\phi}{\partial x^2} - \frac{\partial^2\phi}{\partial y^2} = 0.$$

Note that, although these three examples are of the given type throughout the
$(x, y)$-plane, equations of mixed type are possible. For example, Tricomi’s equation,
$\phi_{xx} = x\phi_{yy}$, is elliptic for $x < 0$ and hyperbolic for $x > 0$.

† Note that

$$\frac{\partial(\xi, \eta)}{\partial(x, y)} = \det\begin{pmatrix} \xi_x & \xi_y \\ \eta_x & \eta_y \end{pmatrix}$$

is the Jacobian of the transformation. The Jacobian is the factor by which the transformation
changes infinitesimal volume elements.
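The classification rule can be sketched in a few lines of Python. This is only an illustration: the function names and the tolerance `tol` used to decide when $ac - b^2$ counts as zero are assumptions made here, not part of the text.

```python
def classify(a: float, b: float, c: float, tol: float = 1e-12) -> str:
    """Classify a*phi_xx + 2*b*phi_xy + c*phi_yy + ... = f at a single point.

    The coefficients follow the convention of (7.1), in which the mixed
    derivative carries a factor of 2, so the relevant quantity is a*c - b**2.
    The tolerance is a hypothetical choice for floating point input.
    """
    disc = a * c - b * b
    if disc > tol:
        return "elliptic"
    if disc < -tol:
        return "hyperbolic"
    return "parabolic"


def tricomi_type(x: float) -> str:
    """Tricomi's equation phi_xx - x*phi_yy = 0 has a = 1, b = 0, c = -x."""
    return classify(1.0, 0.0, -x)


if __name__ == "__main__":
    print(classify(1.0, 0.0, 1.0))     # Laplace's equation
    print(classify(1.0, 0.0, 0.0))     # diffusion equation K*phi_xx - phi_y = 0, K = 1
    print(classify(1.0, 0.0, -1.0))    # wave equation with unit wave speed
    print(tricomi_type(-2.0), tricomi_type(3.0))
```

Because the type depends on the point at which the coefficients are evaluated, an equation of mixed type such as Tricomi’s simply returns different labels for different values of $x$.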


7.1.2 Canonical Forms 

Any equation of the form given by (7.1) can be written in canonical form 
by choosing the canonical coordinate system, in terms of which the second 
derivatives appear in the simplest possible way. 

Hyperbolic Equations: $ac < b^2$

In this case, we can factorize $A$ and $C$ to give

$$A = a\xi_x^2 + 2b\xi_x\xi_y + c\xi_y^2 = \left(p_1\xi_x + q_1\xi_y\right)\left(p_2\xi_x + q_2\xi_y\right),$$

$$C = a\eta_x^2 + 2b\eta_x\eta_y + c\eta_y^2 = \left(p_1\eta_x + q_1\eta_y\right)\left(p_2\eta_x + q_2\eta_y\right),$$

with the two factors not multiples of each other. We can then choose $\xi$ and $\eta$ so
that

$$p_1\xi_x + q_1\xi_y = p_2\eta_x + q_2\eta_y = 0,$$

and hence $A = C = 0$. This means that

$$\xi \text{ is constant on curves with } \frac{dy}{dx} = \frac{q_1}{p_1}, \quad \eta \text{ is constant on curves with } \frac{dy}{dx} = \frac{q_2}{p_2}.$$

We can therefore write $p_1\,dy - q_1\,dx = 0$ on the first family of curves and $p_2\,dy - q_2\,dx = 0$ on the second, and hence

$$\left(p_1\,dy - q_1\,dx\right)\left(p_2\,dy - q_2\,dx\right) = 0,$$

which gives

$$a\,dy^2 - 2b\,dx\,dy + c\,dx^2 = 0. \tag{7.7}$$

As we shall see, this is the easiest equation to use to determine $(\xi, \eta)$. We call
$(\xi, \eta)$ the characteristic coordinate system, in terms of which (7.1) takes its
canonical form

$$\psi_{\xi\eta} + b_1(\xi, \eta)\psi_\xi + b_2(\xi, \eta)\psi_\eta + b_3(\xi, \eta)\psi = g(\xi, \eta). \tag{7.8}$$

The curves where $\xi$ is constant and the curves where $\eta$ is constant are called the
characteristic curves, or simply characteristics. As we shall see, it is the
existence, or nonexistence, of characteristic curves for the three types of equation
that determines the distinctive properties of their solutions. We discussed the
reduction of the wave equation to this canonical form in Section 3.9.1.

As a less trivial example, consider the hyperbolic equation

$$\phi_{xx} - \mathrm{sech}^4 x\,\phi_{yy} = 0. \tag{7.9}$$

Equation (7.7) shows that the characteristics are given by

$$dy^2 - \mathrm{sech}^4 x\,dx^2 = \left(dy + \mathrm{sech}^2 x\,dx\right)\left(dy - \mathrm{sech}^2 x\,dx\right) = 0,$$

and hence

$$\frac{dy}{dx} = \pm\,\mathrm{sech}^2 x.$$

The characteristics are therefore $y \pm \tanh x = \text{constant}$, and the characteristic
coordinates are $\xi = y + \tanh x$, $\eta = y - \tanh x$. On writing (7.9) in terms of these
variables, with $\phi(x, y) = \psi(\xi, \eta)$, we find that its canonical form is

$$\psi_{\xi\eta} = \frac{(\eta - \xi)\left(\psi_\xi - \psi_\eta\right)}{4 - (\xi - \eta)^2}, \tag{7.10}$$

in the domain $(\eta - \xi)^2 < 4$.
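As a quick numerical check of this example, the sketch below evaluates the transformed coefficients of (7.4), taken here as $A = a\xi_x^2 + 2b\xi_x\xi_y + c\xi_y^2$, $B = a\xi_x\eta_x + b(\eta_x\xi_y + \xi_x\eta_y) + c\xi_y\eta_y$ and $C = a\eta_x^2 + 2b\eta_x\eta_y + c\eta_y^2$, for $a = 1$, $b = 0$, $c = -\mathrm{sech}^4 x$ with $\xi = y + \tanh x$, $\eta = y - \tanh x$. The function names are illustrative assumptions.

```python
import math


def sech(x: float) -> float:
    return 1.0 / math.cosh(x)


def transformed_coefficients(a, b, c, xi_x, xi_y, eta_x, eta_y):
    """Return (A, B, C) as defined in (7.4)."""
    A = a * xi_x**2 + 2.0 * b * xi_x * xi_y + c * xi_y**2
    B = a * xi_x * eta_x + b * (eta_x * xi_y + xi_x * eta_y) + c * xi_y * eta_y
    C = a * eta_x**2 + 2.0 * b * eta_x * eta_y + c * eta_y**2
    return A, B, C


def example_7_9(x: float):
    """Coefficients for phi_xx - sech^4(x) phi_yy = 0 in characteristic coordinates."""
    a, b, c = 1.0, 0.0, -sech(x) ** 4
    s2 = sech(x) ** 2
    # xi = y + tanh x  ->  xi_x = sech^2 x,  xi_y = 1
    # eta = y - tanh x ->  eta_x = -sech^2 x, eta_y = 1
    return transformed_coefficients(a, b, c, s2, 1.0, -s2, 1.0)


if __name__ == "__main__":
    for x in (-1.0, 0.0, 0.7):
        print(x, example_7_9(x))
```

For every $x$ the values of $A$ and $C$ vanish to rounding error while $B = -2\,\mathrm{sech}^4 x$ does not, which is what leaves $\psi_{\xi\eta}$ as the only second derivative in the canonical form.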


Parabolic Equations: $ac = b^2$

In this case,

$$A = a\xi_x^2 + 2b\xi_x\xi_y + c\xi_y^2 = \left(p\xi_x + q\xi_y\right)^2,$$

$$C = a\eta_x^2 + 2b\eta_x\eta_y + c\eta_y^2 = \left(p\eta_x + q\eta_y\right)^2,$$

so we can only construct one set of characteristic curves. We therefore take $\xi$ to be
constant on the curves $p\,dy - q\,dx = 0$. This gives us $A = 0$ and, since $AC = B^2$,
$B = 0$. For any set of curves where $\eta$ is constant that is never parallel to the
characteristics, $C$ does not vanish, and the canonical form is

$$\psi_{\eta\eta} + b_1(\xi, \eta)\psi_\xi + b_2(\xi, \eta)\psi_\eta + b_3(\xi, \eta)\psi = g(\xi, \eta). \tag{7.11}$$

We can now see that the diffusion equation is in canonical form.

As a further example, consider the parabolic equation

$$\phi_{xx} + 2\,\mathrm{cosec}\,y\,\phi_{xy} + \mathrm{cosec}^2 y\,\phi_{yy} = 0. \tag{7.12}$$

The characteristic curves satisfy

$$dy^2 - 2\,\mathrm{cosec}\,y\,dx\,dy + \mathrm{cosec}^2 y\,dx^2 = \left(dy - \mathrm{cosec}\,y\,dx\right)^2 = 0,$$

and hence

$$\frac{dy}{dx} = \mathrm{cosec}\,y.$$

The characteristic curves are therefore given by $x + \cos y = \text{constant}$, and we can
take $\xi = x + \cos y$ as the characteristic coordinate. A suitable choice for the
other coordinate is $\eta = y$. With these choices, $A = B = 0$ and $C = \mathrm{cosec}^2 y$, and the
only first derivative term comes from $\xi_{yy} = -\cos y$. On writing (7.12) in terms of
these variables, with $\phi(x, y) = \psi(\xi, \eta)$, we find that its canonical form is

$$\psi_{\eta\eta} = \cos\eta\,\psi_\xi, \tag{7.13}$$

in the whole $(\xi, \eta)$-plane.

Elliptic Equations: $ac > b^2$

In this case, we can make neither $A$ nor $C$ zero, since no real characteristic curves
exist. Instead, we can simplify by making $A = C$ and $B = 0$, so that the second
derivatives form the Laplacian, $\nabla^2\psi$, and the canonical form is

$$\psi_{\xi\xi} + \psi_{\eta\eta} + b_1(\xi, \eta)\psi_\xi + b_2(\xi, \eta)\psi_\eta + b_3(\xi, \eta)\psi = g(\xi, \eta). \tag{7.14}$$

Clearly, Laplace’s equation is in canonical form.

In order to proceed, we must solve

$$A - C = a\left(\xi_x^2 - \eta_x^2\right) + 2b\left(\xi_x\xi_y - \eta_x\eta_y\right) + c\left(\xi_y^2 - \eta_y^2\right) = 0,$$

$$B = a\xi_x\eta_x + b\left(\eta_x\xi_y + \xi_x\eta_y\right) + c\xi_y\eta_y = 0.$$

We can do this by defining $\chi = \xi + i\eta$, and noting that these two equations form
the real and imaginary parts of

$$a\chi_x^2 + 2b\chi_x\chi_y + c\chi_y^2 = 0,$$

and hence that

$$\frac{\chi_x}{\chi_y} = \frac{-b \pm i\sqrt{ac - b^2}}{a}. \tag{7.15}$$

Now $\chi$ is constant on curves given by $\chi_y\,dy + \chi_x\,dx = 0$, and hence, from (7.15),
on

$$\frac{dy}{dx} = \frac{b \mp i\sqrt{ac - b^2}}{a}. \tag{7.16}$$

By solving (7.16) we can deduce $\xi$ and $\eta$. For example, consider the elliptic equation

$$\phi_{xx} + \mathrm{sech}^4 x\,\phi_{yy} = 0. \tag{7.17}$$

In this case, $\chi = \xi + i\eta$ is constant on the curves given by

$$\frac{dy}{dx} = \pm i\,\mathrm{sech}^2 x,$$

and hence $y \mp i\tanh x = \text{constant}$. We can therefore take $\chi = y + i\tanh x$, and
hence $\xi = y$, $\eta = \tanh x$. On writing (7.17) in terms of these variables, with
$\phi(x, y) = \psi(\xi, \eta)$, and using $\eta_x = 1 - \eta^2$, we find that its canonical form is

$$\psi_{\xi\xi} + \psi_{\eta\eta} = \frac{2\eta}{1 - \eta^2}\psi_\eta, \tag{7.18}$$

in the domain $|\eta| < 1$.

We can now describe some of the properties of the three different types of equation.
For more detailed information, the reader is referred to Kevorkian (1990) and
Carrier and Pearson (1988).



180 


CLASSIFICATION AND COMPLEX VARIABLE METHODS 


7.1.3 Properties of Hyperbolic Equations 

Hyperbolic equations are distinguished by the existence of two sets of characteristics.
This allows us to establish two key properties. Firstly, characteristics
are carriers of information. For the wave equation, we saw in Section 3.9.1 that
solutions propagate at speeds $\pm c$, which corresponds to propagation on the
characteristic curves, $x \pm ct = \text{constant}$. Indeed, the use of the independent variable
$t$, time, instead of $y$ suggests that the equation has an evolutionary nature. More
specifically, consider the Cauchy problem for a hyperbolic equation of the form
(7.1). In a Cauchy problem, we specify a curve $C$ in the $(x, y)$-plane upon which we
know the Cauchy data: $\phi$ and the derivative of $\phi$ normal to $C$, $\partial\phi/\partial n$. The initial
value problem for the wave equation that we studied in Section 3.9.1 is a Cauchy
problem. Does this problem have a solution in general? Let’s assume that $\phi$ and
$\partial\phi/\partial n$ on $C$ can be expanded as power series, and that the functions that appear
as coefficients in (7.1) can also be expanded as power series in the neighbourhood
of $C$. We can then write

$$\phi(\xi, \eta) = \phi(\xi, 0) + \eta\frac{\partial\phi}{\partial\eta}(\xi, 0) + \frac{1}{2!}\eta^2\frac{\partial^2\phi}{\partial\eta^2}(\xi, 0) + \cdots,$$

where the orthogonal coordinate system $(\xi, \eta)$ is set up so that $\eta = 0$ is the curve
$C$, which is then parameterized by $\xi$. As we have seen, we can write (7.1) in terms
of this new coordinate system as (7.3). We know $\phi(\xi, 0)$ and the normal derivative,
$\partial\phi/\partial\eta(\xi, 0)$, and, provided that the coefficient of $\partial^2\phi/\partial\eta^2$ does not vanish on $C$,
we can deduce $\partial^2\phi/\partial\eta^2(\xi, 0)$ from (7.3). When does the coefficient of $\partial^2\phi/\partial\eta^2$
vanish on $C$? Precisely when $C$ is a characteristic curve. Provided that $C$ is not
a characteristic curve, we can also deduce higher derivatives from derivatives of
(7.3), and hence construct a power series solution, valid in the neighbourhood of
$C$. The formal statement of this informally presented procedure is the
Cauchy-Kowalewski theorem, a local existence theorem (see Garabedian, 1964, for a
formal statement and proof). The effects of the initial data propagate into the
$(x, y)$-plane on the characteristics, so it is inconsistent to specify initial data upon a
characteristic curve. In addition, the solution at any point $(x, y)$ is only dependent
on the initial conditions that lie between the two characteristics through $(x, y)$,
as shown in Figure 7.1. For the wave equation, this is immediately obvious from
d’Alembert’s solution, (3.43).
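The domain of dependence can be illustrated with a short Python sketch. It assumes that d’Alembert’s solution of $\phi_{tt} = c^2\phi_{xx}$ with $\phi(x, 0) = f(x)$ and $\phi_t(x, 0) = g(x)$ takes its standard form, $\phi = \frac{1}{2}\{f(x - ct) + f(x + ct)\} + \frac{1}{2c}\int_{x - ct}^{x + ct} g(s)\,ds$. The function names, the midpoint quadrature and the bump function are choices made here for illustration.

```python
import math


def dalembert(f, g, c, x, t, n=2000):
    """Evaluate d'Alembert's formula, integrating g with the midpoint rule."""
    lo, hi = x - c * t, x + c * t
    h = (hi - lo) / n
    integral = sum(g(lo + (k + 0.5) * h) for k in range(n)) * h
    return 0.5 * (f(lo) + f(hi)) + integral / (2.0 * c)


def bump(centre, width=0.5):
    """A smooth perturbation that vanishes outside |s - centre| < width."""
    def b(s):
        r = (s - centre) / width
        return math.exp(-1.0 / (1.0 - r * r)) if abs(r) < 1.0 else 0.0
    return b


if __name__ == "__main__":
    f = lambda s: math.exp(-s * s)
    g = lambda s: math.sin(s)
    far, near = bump(3.0), bump(0.0)
    # The solution at (x, t) = (0, 1) with c = 1 depends on data in [-1, 1] only.
    base = dalembert(f, g, 1.0, 0.0, 1.0)
    outside = dalembert(lambda s: f(s) + far(s), lambda s: g(s) + far(s), 1.0, 0.0, 1.0)
    inside = dalembert(f, lambda s: g(s) + near(s), 1.0, 0.0, 1.0)
    print(base, outside, inside)
```

Perturbing the initial data outside the interval cut off by the two characteristics through $(x, t)$ leaves the computed value unchanged, whereas a perturbation inside the interval alters it.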

Secondly, discontinuities in the second derivative of $\phi$ can propagate on
characteristic curves. To see this, consider a curve $C$ in the $(x, y)$-plane, not necessarily
a characteristic, given by $\xi(x, y) = \xi_0$. Suppose that $\phi_{\xi\xi}$ is not continuous on $C$,
but that $\phi$ satisfies the hyperbolic equation on either side of $C$. Can we choose $\xi$
in such a way that the equation is satisfied, even though $\phi_{\xi\xi}(\xi_0^-, \eta) \neq \phi_{\xi\xi}(\xi_0^+, \eta)$?
If we evaluate the equation on either side of the curve and subtract, we find that

$$A(\xi_0, \eta)\left\{\phi_{\xi\xi}(\xi_0^+, \eta) - \phi_{\xi\xi}(\xi_0^-, \eta)\right\} = 0.$$

We can therefore satisfy the equation if $A(\xi_0, \eta) = 0$, and hence $C$ is a characteristic
curve. We conclude that discontinuities in the second derivative can propagate on
characteristic curves. In general, $\phi$ and its first derivatives are continuous, but for
the wave equation, in which only second derivatives appear, discontinuities in $\phi$
and its derivatives can also propagate on characteristics, as shown in Figure 3.10.

Fig. 7.1. The domain of dependence of the solution of a hyperbolic equation. The solution
at $(x, y)$ depends only upon the initial data on the marked part of the initial line, $C$.


7.1.4 Properties of Elliptic Equations 

For elliptic equations, $A$ and $C$ are never zero, and there are no characteristics.
Solutions are therefore infinitely-differentiable. Moreover, there is no timelike
variable; for example, Laplace’s equation is written in terms of the spatial variables, $x$
and $y$. For physical problems that can be modelled using elliptic equations, boundary
value problems, rather than initial value problems, usually arise naturally; for
example the steady state diffusion problem discussed in Section 2.6.1. In fact, the
solution of the Cauchy problem for elliptic equations does not depend continuously
on the initial conditions, and is therefore not a sensible representation of any real,
physical problem.† We can easily demonstrate this for Laplace’s equation.
Consider the Cauchy, or initial value, problem

$$\phi_{xx} + \phi_{tt} = 0 \quad \text{for } t \ge 0,\ -\infty < x < \infty, \tag{7.19}$$

subject to

$$\phi(x, 0) = \phi_0(x), \quad \phi_t(x, 0) = v_0(x). \tag{7.20}$$

† We say that the problem is ill-posed.

If $\phi_0 = 0$ and $v_0 = \frac{1}{\lambda}\sin\lambda x$, a solution is $\phi = \hat\phi(x, t)$, where

$$\hat\phi = \frac{1}{\lambda^2}\sin\lambda x\,\sinh\lambda t.$$

As $\lambda \to \infty$, $\hat\phi_t(x, 0) = \frac{1}{\lambda}\sin\lambda x \to 0$. However, for $t > 0$, $\max|\hat\phi| = \frac{1}{\lambda^2}\sinh\lambda t \to \infty$
as $\lambda \to \infty$. The smaller the initial condition, the larger the solution. Now, let
$\Phi(x, t)$ be the general solution of (7.19) subject to (7.20). Consider the function
$\tilde\phi = \Phi + \hat\phi$. The larger the value of $\lambda$, the closer $\tilde\phi_t(x, 0)$ is to $v_0(x)$. However, $|\tilde\phi|$
increases without bound for $t > 0$ as $\lambda$ increases. An arbitrarily small change in
the boundary data produces an arbitrarily large change in the solution.
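The loss of continuous dependence is easy to see numerically. The sketch below tabulates the size of the initial data, $1/\lambda$, against the size of the solution $\frac{1}{\lambda^2}\sin\lambda x\,\sinh\lambda t$ at a fixed time; the function names and the sample values of $\lambda$ are illustrative assumptions.

```python
import math


def phi_hat(lam: float, x: float, t: float) -> float:
    """The solution (1/lam^2) sin(lam x) sinh(lam t) of phi_xx + phi_tt = 0."""
    return math.sin(lam * x) * math.sinh(lam * t) / lam**2


def initial_slope_amplitude(lam: float) -> float:
    """Maximum over x of |d(phi_hat)/dt| at t = 0, which is 1/lam."""
    return 1.0 / lam


def solution_amplitude(lam: float, t: float) -> float:
    """Maximum over x of |phi_hat| at time t, which is sinh(lam t)/lam^2."""
    return math.sinh(lam * t) / lam**2


if __name__ == "__main__":
    for lam in (10.0, 20.0, 40.0):
        print(lam, initial_slope_amplitude(lam), solution_amplitude(lam, 1.0))
```

As $\lambda$ increases the initial data shrink like $1/\lambda$ while the amplitude at $t = 1$ grows exponentially, which is the behaviour that makes the problem ill-posed.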

Laplace’s equation is the simplest possible elliptic equation, and, as we have 
seen, arises in many different physical contexts. Let’s consider some more of its 
properties. 


Theorem 7.1 (The maximum principle for Laplace’s equation) Let $D$ be
a connected, bounded, open set in two or three dimensions, and $\phi$ a solution of
Laplace’s equation in $D$. Then $\phi$ attains its maximum and minimum values on
$\partial D$, the boundary of $D$, and nowhere in the interior of $D$, unless $\phi$ is a constant.

Proof The idea of the proof is straightforward. For example in two dimensions, at
a local maximum in the interior of $D$, $\phi_x = \phi_y = 0$, $\phi_{xx} \le 0$ and $\phi_{yy} \le 0$. At a
maximum with $\phi_{xx} < 0$ or $\phi_{yy} < 0$, we have $\nabla^2\phi = \phi_{xx} + \phi_{yy} < 0$, and $\phi$ cannot
be a solution of Laplace’s equation. However, it is possible to have $\phi_{xx} = \phi_{yy} = 0$
at an interior local maximum, for example, if the Taylor series expansion close to
$(x, y) = (x_0, y_0)$ is $\phi = -(x - x_0)^4 - (y - y_0)^4 + \cdots$. Although this is clearly not a
solution of Laplace’s equation, we do need to do a little work in order to exclude
this possibility in general.

Let $\psi = \phi + \epsilon|\mathbf{x}|^2$, with $\epsilon > 0$ and $|\mathbf{x}|^2 = x^2 + y^2$ in two dimensions and
$|\mathbf{x}|^2 = x^2 + y^2 + z^2$ in three dimensions. We then have

$$\nabla^2\psi = \nabla^2\phi + \epsilon\nabla^2|\mathbf{x}|^2 = k\epsilon > 0,$$

where $k = 4$ in two dimensions and $k = 6$ in three dimensions. Since $\nabla^2\psi \le 0$ at an
interior local maximum, we conclude that $\psi$ has no local maximum in the interior
of $D$, and hence that $\psi$ must attain its maximum value on the boundary of $D$, say
at $\mathbf{x} = \mathbf{x}_0$. But, by definition

$$\phi(\mathbf{x}) \le \psi(\mathbf{x}) \le \psi(\mathbf{x}_0) = \phi(\mathbf{x}_0) + \epsilon|\mathbf{x}_0|^2 \le \max_{\partial D}\phi + \epsilon l^2,$$

where $l$ is the greatest distance from $\partial D$ to the origin. Since this is true for all
$\epsilon > 0$, we can make $\epsilon$ arbitrarily small, and conclude that

$$\phi(\mathbf{x}) \le \max_{\partial D}\phi \quad \text{for all } \mathbf{x} \in D.$$

Similarly, $-\phi$, which also satisfies Laplace’s equation, attains its maximum value
on $\partial D$, and hence $\phi$ attains its minimum value on $\partial D$. □

We can use the maximum principle to show that the Dirichlet problem for
Laplace’s equation has a unique solution.


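Theorem 7.2 The Dirichlet problem

$$\nabla^2\phi = 0 \quad \text{for } \mathbf{x} \in D, \tag{7.21}$$

with

$$\phi(\mathbf{x}) = f(\mathbf{x}) \quad \text{on } \partial D, \tag{7.22}$$

has a unique solution.

Proof Let $\phi_1$ and $\phi_2$ be solutions of (7.21) subject to (7.22). Then $\psi = \phi_1 - \phi_2$
satisfies

$$\nabla^2\psi = 0 \quad \text{for } \mathbf{x} \in D, \tag{7.23}$$

with

$$\psi(\mathbf{x}) = 0 \quad \text{on } \partial D. \tag{7.24}$$

By Theorem 7.1, $\psi$ attains its maximum and minimum values on $\partial D$, so that
$0 \le \psi(\mathbf{x}) \le 0$ for $\mathbf{x} \in D$, and hence $\psi \equiv 0$. This means that $\phi_1 = \phi_2$, and hence
that there is a unique solution. □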
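The maximum principle of Theorem 7.1 has a discrete counterpart that can be checked directly. The sketch below is only an illustration under stated assumptions: it replaces Laplace’s equation on the unit square by the five-point finite difference approximation and solves it by Gauss-Seidel relaxation, neither of which is discussed in the text, and the grid size, sweep count and function names are arbitrary choices.

```python
def solve_dirichlet(boundary, n=20, sweeps=500):
    """Gauss-Seidel relaxation for the five-point discrete Laplace equation
    on the unit square, with phi = boundary(x, y) on the edges."""
    h = 1.0 / n
    u = [[0.0] * (n + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        for j in range(n + 1):
            if i in (0, n) or j in (0, n):
                u[i][j] = boundary(i * h, j * h)
    for _ in range(sweeps):
        for i in range(1, n):
            for j in range(1, n):
                # Each interior value becomes the average of its four neighbours.
                u[i][j] = 0.25 * (u[i - 1][j] + u[i + 1][j] + u[i][j - 1] + u[i][j + 1])
    return u


def boundary_and_interior_extremes(u):
    """Return (edge min, edge max, interior min, interior max) of the grid."""
    n = len(u) - 1
    edge = [u[i][j] for i in range(n + 1) for j in range(n + 1)
            if i in (0, n) or j in (0, n)]
    inner = [u[i][j] for i in range(1, n) for j in range(1, n)]
    return min(edge), max(edge), min(inner), max(inner)


if __name__ == "__main__":
    # Boundary data taken from the harmonic function x^2 - y^2.
    u = solve_dirichlet(lambda x, y: x * x - y * y)
    print(boundary_and_interior_extremes(u))
```

Since every interior value is an average of its neighbours, the interior extremes cannot exceed those on the edges, mirroring the argument in the proof. With boundary data from $x^2 - y^2$ the relaxed solution also approaches that function in the interior, consistent with uniqueness.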


The maximum principle can also be used to study other elliptic equations. For
example, consider the boundary value problem

$$\nabla^2\phi = -1 \quad \text{for } \mathbf{x} \in X = \left\{(x, y) \;\middle|\; \frac{|x|}{a} + \frac{|y|}{b} < 1\right\}, \tag{7.25}$$

with $a > b > 0$, subject to

$$\phi = 0 \quad \text{on } \partial X. \tag{7.26}$$

How big is $\phi(0, 0)$? We can obtain some bounds on this quantity by transforming
(7.25) into Laplace’s equation. We define $\psi = (x^2 + y^2)/4$, so that $\nabla^2\psi = 1$, and
let $\hat\phi = \phi + \psi$, so that $\nabla^2\hat\phi = 0$. We can also see that $\hat\phi = \psi$ on $\partial X$, and that
$\hat\phi(0, 0) = \phi(0, 0)$. Theorem 7.1 then shows that

$$\min_{\partial X}\psi \le \phi(0, 0) \le \max_{\partial X}\psi.$$

We can therefore bound $\phi(0, 0)$ using the maximum and minimum values of
$\psi = (x^2 + y^2)/4$ on the boundary of $X$. The maximum value is clearly $a^2/4$. Since $X$
is symmetric about both coordinate axes, we can find the minimum value of $\psi$ by
determining the value of the radius $r_0$ for which the circle $x^2 + y^2 = r_0^2$ just touches
the straight line $x/a + y/b = 1$, which forms the boundary of $X$ in the quadrant
$x > 0$, $y > 0$. A little algebra shows that $r_0^2 = a^2b^2/(a^2 + b^2)$, and hence that

$$\frac{a^2b^2}{4\left(a^2 + b^2\right)} \le \phi(0, 0) \le \frac{a^2}{4}.$$
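These bounds are easy to evaluate. The sketch below takes the lower and upper bounds to be $a^2b^2/\{4(a^2 + b^2)\}$ and $a^2/4$, as the argument above implies, and compares them with a brute-force sample of $(x^2 + y^2)/4$ along the edge $x/a + y/b = 1$. The function names and the number of sample points are assumptions made for illustration.

```python
def phi_origin_bounds(a: float, b: float):
    """Bounds on phi(0, 0) from the maximum principle, for a > b > 0."""
    if not a > b > 0:
        raise ValueError("expected a > b > 0")
    lower = a * a * b * b / (4.0 * (a * a + b * b))
    upper = a * a / 4.0
    return lower, upper


def psi_extremes_on_edge(a: float, b: float, samples: int = 20001):
    """Sample psi = (x^2 + y^2)/4 along the first-quadrant edge x/a + y/b = 1.

    By symmetry the other three edges give the same extreme values.
    """
    values = []
    for k in range(samples):
        s = k / (samples - 1)
        x, y = a * s, b * (1.0 - s)
        values.append((x * x + y * y) / 4.0)
    return min(values), max(values)


if __name__ == "__main__":
    print(phi_origin_bounds(2.0, 1.0))
    print(psi_extremes_on_edge(2.0, 1.0))
```

For $a = 2$, $b = 1$ both routes give a lower bound close to $0.2$ and an upper bound of $1$, so the closed-form expressions agree with the sampled extremes.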


Finally, the other type of boundary value problem that often arises for Laplace’s
equation is the Neumann problem (see Section 5.4), where the normal derivative of
$\phi$ is specified on the boundary of the solution domain. Such problems can only be
solved if the boundary data satisfies a solubility condition.


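Theorem 7.3 A necessary condition for the existence of a solution of the Neumann
problem

$$\nabla^2\phi = 0 \quad \text{for } \mathbf{x} \in D \subset \mathbb{R}^3, \tag{7.27}$$

with

$$\frac{\partial\phi}{\partial n}(\mathbf{x}) = g(\mathbf{x}) \quad \text{on } \partial D, \tag{7.28}$$

is

$$\int_{\partial D} g(\mathbf{x})\,d^2\mathbf{x} = 0. \tag{7.29}$$

Proof If we integrate (7.27) over $D$ and use the divergence theorem and (7.28), we
find that

$$\int_D \nabla^2\phi\,d^3\mathbf{x} = \int_{\partial D}\mathbf{n}\cdot\nabla\phi\,d^2\mathbf{x} = \int_{\partial D} g(\mathbf{x})\,d^2\mathbf{x} = 0.$$

□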


This solubility condition has a simple interpretation in terms of inviscid,
incompressible, irrotational fluid flow, which we introduced in Section 2.6.2. In this case,
the Neumann problem specifies a steady flow in $D$ with the normal velocity of the
fluid on $\partial D$ given by $g(\mathbf{x})$. Since the fluid is incompressible, such a flow can only
exist if the total flux into $D$ through $\partial D$ is zero, as expressed by (7.29).


7.1.5 Properties of Parabolic Equations 

Parabolic equations have just one set of characteristics. For example, for the
diffusion equation, $K\phi_{xx} = \phi_t$ with $K > 0$, $t$ is constant on the characteristic
curves. Any localized disturbance is therefore felt everywhere in
$-\infty < x < \infty$ instantaneously, as we can see from Example 5 of Section 6.4. In
addition, solutions are infinitely-differentiable with respect to $x$. We can also prove
a maximum principle for the diffusion equation.

Theorem 7.4 (The maximum principle for the diffusion equation) Let $\phi$
be a solution of the diffusion equation. Consider the domain $0 \le x \le L$, $0 \le t \le T$.
Then $\phi$ attains its maximum value on $x = 0$, $0 \le t \le T$, or $x = L$, $0 \le t \le T$, or
$t = 0$, $0 \le x \le L$.

Proof This is very similar to the proof of the maximum principle for Laplace’s
equation, and we leave it as Exercise 7.8(a). □

We can use this maximum principle to show that the solution of the initial-boundary
value problem given by

$$\frac{\partial\phi}{\partial t} = K\frac{\partial^2\phi}{\partial x^2} \quad \text{for } t > 0,\ 0 < x < L, \tag{7.30}$$

subject to

$$\phi(x, 0) = \phi_0(x) \quad \text{for } 0 \le x \le L, \tag{7.31}$$

and

$$\phi(0, t) = \phi_1(t), \quad \phi(L, t) = \phi_2(t) \quad \text{for } t > 0, \tag{7.32}$$

with $\phi_0$, $\phi_1$ and $\phi_2$ prescribed functions, is unique. We leave the details as
Exercise 7.8(b).

We have now seen that parabolic equations have a single set of characteristics,
and that for a canonical example, the diffusion equation, there is a maximum principle,
so that parabolic equations share some of the features of both hyperbolic and
elliptic equations. We end this section by stating a useful theorem that holds for
reaction-diffusion equations, which are parabolic. We will meet reaction-diffusion
equations frequently in Part 2.


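Theorem 7.5 (A comparison theorem for reaction-diffusion equations)
Consider the reaction-diffusion equation

$$\phi_t = K\nabla^2\phi + f(\phi, \mathbf{x}, t) \quad \text{for } \mathbf{x} \in D,\ t > 0, \tag{7.33}$$

with $K > 0$ and $f$ a smooth function. If $\bar\phi(\mathbf{x}, t)$ is a bounded function that satisfies

$$\bar\phi_t \ge K\nabla^2\bar\phi + f(\bar\phi, \mathbf{x}, t) \quad \text{for } \mathbf{x} \in D,\ t > 0,$$

we say that $\bar\phi$ is a supersolution of (7.33). Similarly, if $\underline\phi(\mathbf{x}, t)$ is a bounded function
that satisfies

$$\underline\phi_t \le K\nabla^2\underline\phi + f(\underline\phi, \mathbf{x}, t) \quad \text{for } \mathbf{x} \in D,\ t > 0,$$

we say that $\underline\phi$ is a subsolution of (7.33). If there also exist constants $\alpha$ and $\beta$ with
$\alpha^2 + \beta^2 \neq 0$, such that

$$\alpha\bar\phi - \beta\frac{\partial\bar\phi}{\partial n} \ge \alpha\underline\phi - \beta\frac{\partial\underline\phi}{\partial n} \quad \text{for } \mathbf{x} \in \partial D,\ t > 0,$$

and

$$\bar\phi(\mathbf{x}, 0) \ge \underline\phi(\mathbf{x}, 0) \quad \text{for } \mathbf{x} \in D,$$

then

$$\bar\phi(\mathbf{x}, t) \ge \underline\phi(\mathbf{x}, t) \quad \text{for } \mathbf{x} \in D,\ t > 0.$$

We will not prove this result here (see Grindrod, 1991, for further details).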


To see how this theorem can be used, consider the case $f = \phi(1 - \phi)$, with
$\partial\phi/\partial n = 0$ on $\partial D$. This problem arises in a model for the propagation of chemical
waves (see Billingham and King, 2001). If $0 \le \phi(\mathbf{x}, 0) \le 1$, then by taking firstly
$\bar\phi = 1$, $\underline\phi = \phi$, and secondly $\bar\phi = \phi$, $\underline\phi = 0$, we find that $0 \le \phi(\mathbf{x}, t) \le 1$ for $t \ge 0$.
In other words, the comparison theorem allows us to determine upper and lower
bounds on the solution (see Exercise 7.9 for another example).
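The bound $0 \le \phi \le 1$ can be observed in a simple one-dimensional simulation. The sketch below is an illustration under assumptions that are not part of the text: it uses an explicit finite difference scheme for $\phi_t = K\phi_{xx} + \phi(1 - \phi)$ on $0 \le x \le 1$, imposes the zero-flux condition with mirror points, and picks a time step small enough for the discrete update to respect the same bounds.

```python
def evolve_fisher(phi0, K=1.0, length=1.0, t_end=0.2, dt=None):
    """Explicit finite differences for phi_t = K phi_xx + phi (1 - phi)
    with zero-flux conditions at both ends of the interval."""
    n = len(phi0) - 1
    dx = length / n
    if dt is None:
        # Hypothetical choice: K*dt/dx^2 = 0.2, comfortably inside the
        # stability limit of the explicit scheme on this grid.
        dt = 0.2 * dx * dx / K
    r = K * dt / (dx * dx)
    phi = list(phi0)
    for _ in range(int(round(t_end / dt))):
        new = phi[:]
        for i in range(n + 1):
            left = phi[i - 1] if i > 0 else phi[1]        # mirror point at x = 0
            right = phi[i + 1] if i < n else phi[n - 1]   # mirror point at x = length
            new[i] = (phi[i] + r * (left - 2.0 * phi[i] + right)
                      + dt * phi[i] * (1.0 - phi[i]))
        phi = new
    return phi


if __name__ == "__main__":
    n = 50
    start = [i / n for i in range(n + 1)]   # initial data between 0 and 1
    final = evolve_fisher(start)
    print(min(final), max(final))
```

Starting from data between 0 and 1, the computed solution stays between 0 and 1, in line with the supersolution $\bar\phi = 1$ and subsolution $\underline\phi = 0$ used above.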

7.2 Complex Variable Methods for Solving Laplace’s Equation 

In Section 2.6.2 we described how steady, inviscid, incompressible, irrotational fluid
flow, commonly referred to as ideal fluid flow, past a rigid body can be modelled
as a boundary value problem for Laplace’s equation, $\nabla^2\phi = 0$, in the fluid, with
$\partial\phi/\partial n = 0$ on the boundary of the body, where $\phi$ is the velocity potential and
$\mathbf{u} = \nabla\phi$ the velocity field. Appropriate conditions at infinity also need to be
prescribed, for example, a uniform stream. Once $\phi$ is known, we can determine the
pressure using Bernoulli’s equation, (2.18). In general, solutions of this boundary
value problem are hard to find, except for the simplest body shapes. However,
two-dimensional flow is an exception, since powerful complex variable methods can,
without too much effort, give simple descriptions of the flow past many
simply-connected body shapes, such as aircraft wing sections, channels with junctions and
flows with free surfaces (see, for example, Milne-Thompson, 1960).


7.2.1 The Complex Potential 

Recall from Section 2.6.2 that the velocity in a two-dimensional ideal fluid flow,
$\mathbf{u} = (u, v) = (\phi_x, \phi_y)$, satisfies the continuity equation, $u_x + v_y = 0$. We can
therefore introduce a stream function, $\psi(x, y)$, defined by $u = \psi_y$, $v = -\psi_x$,
which, for sufficiently smooth functions, satisfies the continuity equation identically.
Elementary calculus shows that the stream function is constant on any streamline
in the flow, and the change in $\psi$ between any two streamlines is equal to the flux of
fluid between them. If we now look at the definitions we have for the components
of velocity, for compatibility we need $\phi_x = \psi_y$ and $\phi_y = -\psi_x$. These are the
Cauchy-Riemann equations for the complex function $w = \phi + i\psi$ and, with our
smoothness assumption, imply that $w(z)$ is an analytic function of the complex
variable $z = x + iy$ in the domain occupied by the fluid.† The quantity $w(z)$ is
called the complex potential for the flow and, for simple flows, can be found
easily. We can also see that

$$\frac{dw}{dz} = \phi_x + i\psi_x = u - iv = qe^{-i\theta},$$

where $q$ is the magnitude of the velocity and $\theta$ the angle that it makes with the
$x$-axis, so that, once we know $w$, we can easily compute the components of the
velocity, and vice versa.

Let’s consider some examples. 


(i) A uniform stream flowing at an angle $\alpha$ to the horizontal has $u = U\cos\alpha$,
$v = U\sin\alpha$, so that

$$\frac{dw}{dz} = u - iv = Ue^{-i\alpha},$$

and hence $w = Ue^{-i\alpha}z$.

(ii) A point vortex is a system of concentric circular streamlines centred on
$z = z_0$, with complex potential

$$w = \frac{i\kappa}{2\pi}\log(z - z_0).$$

If we now define the circulation to be $\oint_C \mathbf{u}\cdot d\mathbf{r}$, where $C$ is any closed contour
that encloses $z = z_0$, the point vortex has circulation $\kappa$. By taking real and
imaginary parts, we can see that $\phi = -\kappa\theta/2\pi$, $\psi = \kappa\log r/2\pi$, where $r$ and $\theta$ are
the polar coordinates centred on $z = z_0$. The streamlines are therefore given
by $\psi = \text{constant}$, and hence $r = \text{constant}$, a family of concentric circles, as
expected.

(iii) A point source of fluid at $z = z_0$ ejects $m$ units of fluid per unit area per
unit time. Its complex potential is

$$w = \frac{m}{2\pi}\log(z - z_0).$$

The streamlines are straight lines passing through $z = z_0$ and, since

$$\frac{dw}{dz} = \frac{m}{2\pi}\frac{1}{z - z_0},$$

we have that $q = m/2\pi r$, where $r$ is the polar coordinate centred at $z = z_0$,
and that the flow is purely radial. It is simple to show that the flow through
any closed curve, $C$, that encloses $z_0$ is

$$\oint_C \mathbf{u}\cdot\mathbf{n}\,dl = m,$$

as expected.

† See Appendix 6 for a reminder of some basic ideas in the theory of complex variables.

A word of warning is required here. The first of these complex potentials is analytic
over the whole $z$-plane. The second and third fail to be analytic at $z = z_0$. In reality,
we need to invoke some extra physics close to this point. For example, any real
source of fluid will be of finite size, and, close to $z = z_0$, we need to take this into
account. For a point vortex, close to the singularity the effect of viscosity becomes
important, and we need to include it. In each case, this can be done by defining
an inner asymptotic region, using the methods that we will discuss in Chapter 12.
Alternatively, if there are no sources or vortices in the flow domain that we are
considering, but we wish to use sources and vortices to model the flow, the points
where the complex potential is not analytic must be excluded from the flow domain.
We will see an example of this in the next section.
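The three elementary flows can be explored with Python’s complex arithmetic. The sketch below assumes the derivatives $dw/dz = Ue^{-i\alpha}$ for the uniform stream, $i\kappa/\{2\pi(z - z_0)\}$ for the point vortex and $m/\{2\pi(z - z_0)\}$ for the point source, together with $dw/dz = u - iv$. The function names and the midpoint quadrature used for the flux are illustrative choices.

```python
import cmath
import math


def velocity(dw_dz: complex):
    """Recover (u, v) from dw/dz = u - i v."""
    return dw_dz.real, -dw_dz.imag


def uniform_stream_dw(U: float, alpha: float):
    return lambda z: U * cmath.exp(-1j * alpha)


def vortex_dw(kappa: float, z0: complex = 0j):
    return lambda z: 1j * kappa / (2.0 * math.pi * (z - z0))


def source_dw(m: float, z0: complex = 0j):
    return lambda z: m / (2.0 * math.pi * (z - z0))


def outward_flux(dw, centre: complex = 0j, radius: float = 1.0, n: int = 720):
    """Approximate the integral of u.n dl around a circle (midpoint rule)."""
    total = 0.0
    for k in range(n):
        theta = 2.0 * math.pi * (k + 0.5) / n
        nx, ny = math.cos(theta), math.sin(theta)
        u, v = velocity(dw(centre + radius * complex(nx, ny)))
        total += (u * nx + v * ny) * radius * 2.0 * math.pi / n
    return total


if __name__ == "__main__":
    print(velocity(uniform_stream_dw(2.0, math.pi / 6)(1 + 1j)))
    print(outward_flux(source_dw(3.0, 0.2 + 0.1j)))   # close to m = 3
    print(outward_flux(vortex_dw(5.0)))               # close to zero
```

The flux through a circle enclosing the source returns $m$ wherever the source sits inside it, while the vortex, whose velocity is purely azimuthal, contributes no net flux.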


7.2.2 Simple Flows Around Blunt Bodies 

Since Laplace’s equation is linear, complex potentials can be added together to
produce more complicated fluid flow fields. For example, consider a uniform stream
of magnitude $U$ flowing parallel to the $x$-axis and a source of strength $2\pi m$ situated
on the $x$-axis at $z = a$. The complex potential is $w = Uz + m\log(z - a)$, from which
we can deduce that

$$\psi = Uy + m\tan^{-1}\left(\frac{y}{x - a}\right). \tag{7.34}$$

This is an odd function of $y$, so that the velocity field associated with it is an even
function of $y$, as we would expect from the symmetry of the flow. The $x$-axis ($y = 0$)
is a streamline with $\psi = 0$, whilst for $x > 0$ there is another streamline with $\psi = 0$,
given by

$$y = (a - x)\tan\left(\frac{Uy}{m}\right), \tag{7.35}$$

which is a blunt-nosed curve. The streamlines are shown in Figure 7.2. As there is
no flow across a streamline, we can replace it with the surface of a rigid body without
affecting the flow pattern. The complex potential (7.34) therefore represents the
flow past a blunt body of the form given by (7.35). At the point $z = a - m/U$,
$dw/dz = 0$, and hence both components of the velocity vanish there. This is called
a stagnation point of the flow. Note that the singularity due to the point source
does not lie within the flow domain, and can be disregarded.
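A short numerical check of this example is sketched below. It takes the stream function to be $\psi = Uy + m\tan^{-1}\{y/(x - a)\}$ and the body shape to be $y = (a - x)\tan(Uy/m)$, and it uses the principal value of the arctangent, so the check on the body is restricted to $0 < |y| < \pi m/2U$. The function names are assumptions made for illustration.

```python
import math


def dw_dz(z: complex, U: float = 1.0, m: float = 1.0, a: float = 1.0) -> complex:
    """Derivative of w = U z + m log(z - a)."""
    return U + m / (z - a)


def stagnation_point(U: float = 1.0, m: float = 1.0, a: float = 1.0) -> float:
    return a - m / U


def psi(x: float, y: float, U: float = 1.0, m: float = 1.0, a: float = 1.0) -> float:
    """Stream function with the principal arctangent; not defined at x = a."""
    return U * y + m * math.atan(y / (x - a))


def body_x(y: float, U: float = 1.0, m: float = 1.0, a: float = 1.0) -> float:
    """Solve the body shape equation for x at a given nonzero y."""
    return a - y / math.tan(U * y / m)


if __name__ == "__main__":
    print(dw_dz(complex(stagnation_point())))   # both velocity components vanish
    for y in (0.1, 0.5, 1.0):
        print(y, body_x(y), psi(body_x(y), y))
```

With $U = m = a = 1$ the velocity vanishes at the origin, the points on the computed body curve have $\psi = 0$ to rounding error, and the curve approaches the stagnation point as $y \to 0$.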



Fig. 7.2. The flow past a blunt body given by the stream function (7.34) with $U = m = a = 1$.
There is a stagnation point at the origin.


Adding potentials is one way to construct ideal fluid flows around rigid bodies.
However, since it is an inverse method (for a given flow field, we can introduce a
rigid body bounded by any streamline), it has its drawbacks. We would prefer a
direct method, whereby we can calculate a stream function for a given rigid body.
Before we describe such a method, we need the circle theorem.

Theorem 7.6 (The circle theorem) If $w = f(z)$ is a complex potential for a
flow with no singularities in the domain $|z| \le a$, then $w = f(z) + f^*(a^2/z)$ is a
complex potential for the same flow obstructed by a circle of radius $a$ centred at the
origin.

Proof Firstly, since $f(z)$ is analytic in $|z| \le a$, $f^*(a^2/z)$ is analytic in $|z| \ge a$, as
$a^2/z$ is just an inversion of the point $z$. Secondly, on the boundary of the circle,
$z = ae^{i\theta}$ with $0 \le \theta \le 2\pi$, so that $f(z) + f^*(a^2/z) = f(ae^{i\theta}) + f^*(ae^{-i\theta})$, which, as
it is the sum of complex conjugate functions, is real. Therefore, $\psi = 0$ on $z = ae^{i\theta}$,
and the boundary of the circle is a streamline. Finally, since $w \sim f(z) + f^*(0)$ as
$|z| \to \infty$, the flow in the far field is given by $f(z)$. □


We can illustrate this theorem by finding the complex potential for a uniform
stream of strength $U$ flowing past a circle of radius $a$. The complex potential for the
stream is $w = Ue^{-i\alpha}z$, which has no singularities in $|z| \le a$. By the circle theorem,
the complex potential for the flow when the circular boundary is introduced is

$$w = Ue^{-i\alpha}z + Ue^{i\alpha}\frac{a^2}{z}.$$

When $\alpha = 0$, the flow is symmetric about the $x$-axis, with

$$\frac{dw}{dz} = U\left(1 - \frac{a^2}{z^2}\right),$$

and $dw/dz \sim U$ as $|z| \to \infty$, as expected. By taking real and imaginary parts, we
can recover the velocity field, potential and stream function, which we discussed in
Section 2.6.2. Before leaving this problem, we note that the complex potential is
not unique, since we can add a point vortex, centred upon $z = 0$, and still have the
circle as a streamline, without introducing a singularity in the flow domain. When
$\alpha = 0$, this gives the potential

$$w = U\left(z + \frac{a^2}{z}\right) + \frac{i\kappa}{2\pi}\log z. \tag{7.36}$$

The introduction of this nonzero circulation about the circle has important
consequences, as it breaks the symmetry (see Figure 7.3), and, as we shall now show,
causes a nonzero lift force on the body.

In order to calculate the force exerted by an ideal fluid flow upon a rigid body,
consider an infinitesimal element of arc on the surface of the body, $ds$. Let the
tangent to this element of arc make an angle $\theta$ with the $x$-axis. The force on this
infinitesimal arc is due to the pressure, and given by

$$-p\sin\theta\,ds + ip\cos\theta\,ds = ipe^{i\theta}\,ds.$$

Fig. 7.3. Flow past a circle, with no circulation ($\kappa = 0$) and with $\kappa = 20$.

Since the boundary of the body is a streamline, $dw^* = dw$, so that $dw^* = (dw/dz)\,dz$,
and hence

$$F_x - iF_y = \frac{1}{2}i\rho\oint_{\rm body}\left(\frac{dw}{dz}\right)^2 dz. \tag{7.37}$$

In this form, the force can usually be evaluated using the residue theorem. For the
example of flow around a circle with circulation $\kappa$, we have

$$F_x - iF_y = \frac{1}{2}\rho i\oint\left\{U\left(1 - \frac{a^2}{z^2}\right) + \frac{i\kappa}{2\pi z}\right\}^2 dz.$$

After evaluating the residue, which is just due to the simple pole, we find that
$F_x - iF_y = -i\rho U\kappa$, so that there is a vertical force of magnitude $\rho U\kappa$, which tries
to lift the body in the direction of increasing $y$. This lift force arises from the
pressure distribution around the circle, with pressures at the lower surface of the
body higher than those on the upper surface. As we shall see in Example 4 in the
next section, it is also the circulation around an aerofoil that provides the lift.
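As a rough numerical cross-check of this residue calculation, the contour integral can be approximated directly on the circle $|z| = a$. This is only a sketch: the function name `blasius_force`, the quadrature rule and the parameter values are choices made here for illustration.

```python
import cmath
import math

def blasius_force(dwdz, radius, rho=1.0, n=4096):
    """Approximate F_x - i F_y = (i rho / 2) * contour integral of (dw/dz)^2 dz
    around |z| = radius, using the periodic trapezoidal rule."""
    total = 0.0 + 0.0j
    for k in range(n):
        z = radius * cmath.exp(2j * math.pi * k / n)
        dz = 1j * z * (2.0 * math.pi / n)      # dz = i z d(theta)
        total += dwdz(z) ** 2 * dz
    return 0.5j * rho * total

# Assumed illustrative values; kappa = 20 matches the second panel of Fig. 7.3.
U, a, kappa, rho = 2.0, 1.0, 20.0, 1.0
dwdz = lambda z: U * (1.0 - a * a / z ** 2) + 1j * kappa / (2.0 * math.pi * z)

force = blasius_force(dwdz, radius=a, rho=rho)   # F_x - i F_y
F_x, F_y = force.real, -force.imag
```

With these values `F_x` should vanish to rounding error and `F_y` should agree with $\rho U\kappa$.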


7.2.3 Conformal Transformations 

Suppose we have a flow in the $z$-plane past a body whose shape cannot immediately
be seen to be the streamline of a simple flow. If we can transform this
problem into a new plane, the $\zeta$-plane, where the shape of the body is simpler,
such as a half-plane or a circle, we may be able to solve for the flow in the $\zeta$-plane,
and then invert the transformation to get the flow in the original, $z$-plane. Specifically,
if we seek $w = w(z)$, the complex potential in the $z$-plane, and we have a
transformation, $\zeta = f(z)$, which maps the surface of the rigid body, and the flow
outside it, onto a flow outside a half-plane or circle in the $\zeta$-plane, then, by defining
$W(\zeta) = w(z)$, we have a correspondence between flow and geometry in both planes.
Any streamline in the $z$-plane transforms to a streamline in the $\zeta$-plane because of
this correspondence in complex potentials. The complex velocity is

$$\frac{dw}{dz} = \frac{dW}{d\zeta}\frac{d\zeta}{dz} = f'(z)\frac{dW}{d\zeta}.$$

If $|f'(z)|$ is bounded and nonzero, except perhaps for isolated points on the boundary
of the domain to be transformed, we say that the transformation between the
$z$- and $\zeta$-planes is conformal, and a unique inverse transformation can be defined.
Conformal mappings are so called because they also preserve the angle between
line segments except at any isolated points where $f'(z)$ is zero or infinity (see
Exercise 7.10). A consequence of this is that if we want to map a domain whose
boundary has corners to a domain with a smooth boundary, we must have $f'(z)$
equal to zero or infinity at these corner points. Such points can cause difficulties.
For example, if $|f'(z)| = \infty$, we could induce an infinite velocity in the $z$-plane,
which is unphysical.

Although beautifully simple in principle, there is a practical difficulty with this
method: how can we construct the transformation, $f(z)$? Fortunately, dictionaries
of common conformal transformations are available (see, for example,
Milne-Thompson, 1952). Let’s try to make things clearer with some examples.


Example 1: Mapping a three-quarter-plane onto a half-plane

Consider the three-quarter-plane that lies in the $z$-plane shown in Figure 7.4(a).
How can we map this onto the half-plane in the $\zeta$-plane shown in Figure 7.4(b)?
Let’s try the transformation $\zeta = Az^n$, so that $\arg\zeta = \arg A + n\arg z$. On $BC$,
$\arg z = -\pi/2$, and we require $B'C'$, the image of $BC$, to have $\arg\zeta = 0$, which
shows that $\arg A = n\pi/2$. On $AB$, $\arg z = \pi$, and we require $A'B'$, the image
of $AB$, to have $\arg\zeta = \pi$. This means that $\pi = \arg A + n\pi$ and hence $n = 2/3$,
$\arg A = \pi/3$. This means that the family of transformations $\zeta = |A|e^{i\pi/3}z^{2/3}$ will
map the three-quarter-plane to the half-plane. If we further require that the image
of $z = -1$ should be $\zeta = -1$, then $|A| = 1$, and $\zeta = e^{i\pi/3}z^{2/3}$. Note that this
transformation is not conformal at $z = 0$, since the angle of the boundary changes
there.
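The mapping found in Example 1 can be spot-checked numerically. The following sketch assumes that the principal branch of $z^{2/3}$ is the intended one (so that $-\pi < \arg z \le \pi$); the sampling grid is an arbitrary choice made for illustration.

```python
import cmath
import math

def zeta(z):
    """Example 1 map, zeta = e^{i pi/3} z^{2/3}, on the principal branch."""
    return cmath.exp(1j * math.pi / 3.0) * z ** (2.0 / 3.0)

radii = [0.1 * k for k in range(1, 51)]

# Interior of the three-quarter-plane: -pi/2 < arg z < pi.
angles = [-math.pi / 2 + 1.5 * math.pi * (j + 0.5) / 60 for j in range(60)]
interior_images = [zeta(r * cmath.exp(1j * th)) for r in radii for th in angles]
min_imag = min(w.imag for w in interior_images)         # should be positive

# The two boundary rays and the normalisation point.
image_of_BC = [zeta(complex(0.0, -r)) for r in radii]   # arg z = -pi/2
image_of_AB = [zeta(complex(-r, 0.0)) for r in radii]   # arg z = pi
image_of_minus_one = zeta(complex(-1.0, 0.0))
```

The interior samples should land in the upper half-plane, the ray $BC$ on the positive real axis and the ray $AB$ on the negative real axis.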


Fig. 7.4. Mapping a three-quarter-plane to a half-plane: (a) the $z$-plane, (b) the $\zeta$-plane.


Example 2: Mapping a strip onto a half-plane

Consider the strip of width 2 that lies in the $z$-plane shown in Figure 7.5(a). How
can we map this onto the half-plane in the $\zeta$-plane shown in Figure 7.5(b)? Let’s
try the transformation $z = K\log\zeta + L$. On $A'B'$, $\arg\zeta = 0$, so if $\zeta = \xi + i\eta$,
$z = K\log|\zeta| + L$. If $K$ is real, the choice $L = -i$ makes $z$ range from $-i - \infty$ to
$-i + \infty$, as required. On $C'D'$, $\arg\zeta = \pi$, so $z = K\log|\zeta| + i\pi K - i$. If we now choose
$i\pi K = 2i$, then $z$ ranges from $i - \infty$ to $i + \infty$, as required. The transformation is
therefore

$$z = \frac{2}{\pi}\log\zeta - i, \qquad \zeta = ie^{\pi z/2},$$

which is conformal in the finite $z$-plane.

Fig. 7.5. Mapping a strip to a half-plane: (a) the $z$-plane, (b) the $\zeta$-plane.


We now pause to note that, in both of these examples, we can use the conformal
mapping to solve the Dirichlet problem for Laplace’s equation, in the three-quarter-plane
and strip respectively, by mapping to the half-plane. We can solve Laplace’s
equation in the half-plane using the formula (5.30), which we derived using Fourier
transforms. We can then easily write down the result in the $z$-plane using the
inverse transformation. To justify this, note that if $\phi(x,y) = \Phi(\xi(x,y), \eta(x,y))$, we
can differentiate to show that

$$\frac{\partial^2\phi}{\partial x^2} + \frac{\partial^2\phi}{\partial y^2} = \left|\frac{d\zeta}{dz}\right|^2\left(\frac{\partial^2\Phi}{\partial\xi^2} + \frac{\partial^2\Phi}{\partial\eta^2}\right),$$

so that, provided $\zeta = f(z)$ is conformal, a function that is harmonic in the $(\xi, \eta)$-plane
is harmonic in the $(x, y)$-plane, and vice versa (see Exercise 7.11).

We can illustrate this by solving a Dirichlet problem for Laplace’s equation in a
strip, namely

$$\nabla^2\phi = 0 \quad \text{for } -\infty < x < \infty \text{ and } -1 < y < 1, \tag{7.39}$$

subject to

$$\phi(x, -1) = 0, \quad \phi(x, 1) = e^{-x^2} \quad \text{for } -\infty < x < \infty. \tag{7.40}$$



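We know that the transformation that maps this strip in the $z = x + iy$-plane to
the upper-half-plane in the $\zeta = \xi + i\eta$-plane is $\zeta = ie^{\pi z/2}$, and hence

$$\xi = -e^{\pi x/2}\sin\left(\tfrac{1}{2}\pi y\right), \quad \eta = e^{\pi x/2}\cos\left(\tfrac{1}{2}\pi y\right).$$

In the $\zeta$-plane, (7.39) and (7.40) become

$$\frac{\partial^2\Phi}{\partial\xi^2} + \frac{\partial^2\Phi}{\partial\eta^2} = 0 \quad \text{for } -\infty < \xi < \infty \text{ and } 0 < \eta < \infty,$$

subject to

$$\Phi(\xi, 0) = \begin{cases} 0 & \text{for } 0 < \xi < \infty, \\ \exp\left\{-\frac{4}{\pi^2}\log^2(-\xi)\right\} & \text{for } -\infty < \xi < 0, \end{cases}$$

and $\Phi \to 0$ as $\xi^2 + \eta^2 \to \infty$. The solution, given by (5.30), is

$$\Phi(\xi, \eta) = \frac{1}{\pi}\int_{s=-\infty}^{0}\frac{\eta\exp\left\{-\frac{4}{\pi^2}\log^2(-s)\right\}}{(\xi - s)^2 + \eta^2}\,ds,$$

and hence

$$\phi(x, y) = \frac{1}{\pi}\int_{s=-\infty}^{0}\frac{e^{\pi x/2}\cos\left(\tfrac{1}{2}\pi y\right)\exp\left\{-\frac{4}{\pi^2}\log^2(-s)\right\}}{\left\{e^{\pi x/2}\sin\left(\tfrac{1}{2}\pi y\right) + s\right\}^2 + e^{\pi x}\cos^2\left(\tfrac{1}{2}\pi y\right)}\,ds.$$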
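The integral solution for the strip can be evaluated numerically to confirm that it reproduces the boundary data. The sketch below is not part of the original derivation: it uses the substitution $s = -e^{\pi u/2}$ (an assumption made here for numerical convenience), under which the boundary data become $e^{-u^2}$, together with a plain trapezoidal rule.

```python
import math

def phi(x, y, u_max=8.0, n=16000):
    """Evaluate the integral solution of (7.39)-(7.40) at a point of the strip.

    With s = -exp(pi*u/2), the integral over -inf < s < 0 becomes an integral
    over -inf < u < inf with the rapidly decaying weight exp(-u^2)."""
    xi = -math.exp(0.5 * math.pi * x) * math.sin(0.5 * math.pi * y)
    eta = math.exp(0.5 * math.pi * x) * math.cos(0.5 * math.pi * y)
    h = 2.0 * u_max / n
    total = 0.0
    for k in range(n + 1):
        u = -u_max + k * h
        e = math.exp(0.5 * math.pi * u)            # e = -s
        integrand = 0.5 * eta * e * math.exp(-u * u) / ((xi + e) ** 2 + eta ** 2)
        weight = 0.5 if k in (0, n) else 1.0       # trapezoidal end weights
        total += weight * integrand
    return total * h

near_top = phi(0.0, 0.99)        # should be close to exp(-0^2) = 1
near_top_x1 = phi(1.0, 0.99)     # should be close to exp(-1)
near_bottom = phi(0.0, -0.99)    # should be close to 0
interior = phi(0.3, 0.2)         # maximum principle: strictly between 0 and 1
```

Close to $y = 1$ the values should approach $e^{-x^2}$, close to $y = -1$ they should approach zero, and in the interior they should lie between the extreme boundary values.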


Example 3: Flow past a flat plate

Consider a plate of length $2a$ positioned perpendicular to a uniform flow with speed
$U$, as shown in Figure 7.6(a) in the $z$-plane. We want to map the exterior of the
plate to the exterior of a circle in the $\zeta$-plane, as shown in Figure 7.6(b). This can
be done using

$$z = \frac{i}{2}\left(\zeta + \frac{a^2}{\zeta}\right). \tag{7.41}$$

To show this, we write the surface of the circle as $\zeta = ae^{i\theta}$, so that $z = ia\cos\theta$.
The points labelled in Figure 7.6(a) in the $z$-plane then map to the corresponding
points labelled in Figure 7.6(b). Following the sequence $A'B'C'D'E'$, the fluid is on
the right hand side in the $\zeta$-plane, and consequently is on the right hand side in
the $z$-plane, so that the outside of the circle maps to the outside of the plate. As
$|\zeta| \to \infty$, $z \sim i\zeta/2$, so that

$$\frac{dw}{dz} \sim -2i\frac{dw}{d\zeta}.$$

Since we require that $dw/dz \sim U$, we must have $dw/d\zeta \sim Ui/2$, so that the stream
at infinity in the $\zeta$-plane is half the strength of the one in the $z$-plane, and has been
rotated through $\pi/2$ radians.


Fig. 7.6. Mapping a flat plate to a unit circle: (a) the $z$-plane, (b) the $\zeta$-plane.
We can now solve the flow problem in the $\zeta$-plane using the circle theorem, to
give

$$w(\zeta) = \frac{1}{2}Ue^{i\pi/2}\zeta + \frac{1}{2}Ue^{-i\pi/2}\frac{a^2}{\zeta}. \tag{7.42}$$

Next, we need to write this potential in terms of $z$. From (7.41),

$$\zeta^2 - \frac{2z}{i}\zeta + a^2 = 0,$$

and hence

$$\zeta = -i\left(\sqrt{z^2 + a^2} + z\right),$$

choosing this root so that $|\zeta| \to \infty$ as $|z| \to \infty$. From the potential (7.42), we find
that

$$w = \frac{1}{2}U\left(\sqrt{z^2 + a^2} + z + \frac{a^2}{\sqrt{z^2 + a^2} + z}\right). \tag{7.43}$$

Although this result is correct, it does give rise to infinite velocities at $z = \pm ia$, the
ends of the plate, which are not physical, and arise because of the sharpness of the
boundary at these points. In reality, viscosity will become important close to these
points, and lead to finite velocities.
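The behaviour described in Example 3 can be explored numerically. In the sketch below the square root is written as $z\sqrt{1 + a^2/z^2}$, an implementation choice made here so that the branch cut of the principal square root lies along the plate and the root behaves like $z$ at infinity; the sample points and step size are arbitrary.

```python
import cmath

a, U = 1.0, 1.0    # assumed illustrative values

def root(z):
    """Branch of sqrt(z^2 + a^2) that behaves like z as |z| -> infinity."""
    return z * cmath.sqrt(1.0 + a * a / (z * z))

def zeta_of_z(z):
    return -1j * (root(z) + z)

def z_of_zeta(zeta):
    return 0.5j * (zeta + a * a / zeta)            # equation (7.41)

def w(z):
    s = root(z) + z
    return 0.5 * U * (s + a * a / s)               # equation (7.43)

def speed(z, h=1.0e-6):
    """|dw/dz| estimated with a centred finite difference."""
    return abs((w(z + h) - w(z - h)) / (2.0 * h))

samples = [complex(x, y) for x in (-3.0, -0.5, 0.4, 2.0) for y in (-2.0, 0.3, 1.7)]
round_trip_error = max(abs(z_of_zeta(zeta_of_z(z)) - z) for z in samples)
outside_circle = all(abs(zeta_of_z(z)) > a for z in samples)
far_speed = speed(complex(1.0e3, 2.0e2))
near_tip_speeds = [speed(complex(d, a)) for d in (1e-1, 1e-2, 1e-3)]
```

The speeds in `near_tip_speeds` grow as the sample point approaches the plate end $z = ia$, while `far_speed` should be close to $U$.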

Example 4: Flow past an aerofoil
As we have seen, the transformation

$$z = \frac{1}{2}\left(\zeta + \frac{1}{\zeta}\right)$$

maps the exterior of a unit circle in the $\zeta$-plane to the exterior of a flat plate in
the $z$-plane. In contrast, the exterior of circles of radius $a > 1$ whose centres are
not at the origin, but which still pass through the point $\zeta = 1$ and enclose the
point $\zeta = -1$, are mapped to shapes, known as Joukowski profiles, that look
rather like aerofoils. These shapes have two distinguishing characteristics: a blunt
nose, or leading edge, and a sharp tail, or trailing edge, at $z = \zeta = 1$, as shown in
Figure 7.7.

Let’s now consider the flow of a stream of strength $U$, at an angle $\alpha$ to the
horizontal, around a Joukowski profile. Since $dz/d\zeta = 0$ at $\zeta = 1$, the transformation is
not conformal at the trailing edge, and the velocity will be infinite there, unless we
can make the flow in the transform plane have a stagnation point there. This can
be achieved by including an appropriate circulation around the aerofoil. We must
choose the strength of the circulation to make the velocity finite at the trailing
edge. This is called the Kutta condition on the flow.

If the centre of the circle in the $\zeta$-plane is at $\zeta = \zeta_c$, we must begin by making
another conformal transformation, $\bar\zeta = (\zeta - \zeta_c)/a$, so that we can use the circle
theorem. The flow in the $\bar\zeta$-plane is a uniform stream and a point vortex, so that

$$w(\bar\zeta) = Ve^{-i\beta}\bar\zeta + \frac{Ve^{i\beta}}{\bar\zeta} + \frac{i\kappa}{2\pi}\log\bar\zeta.$$

Fig. 7.7. Two examples of Joukowski profiles in the $z$-plane, and the corresponding circles
in the $\zeta$-plane, where $\zeta = \xi + i\eta$ and $z = x + iy$. Panels: $\zeta_c = -0.25$ and $\zeta_c = 0.25(-1 + i)$.


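To determine $V$, we note that

$$z = \frac{1}{2}\left(a\bar\zeta + \zeta_c + \frac{1}{a\bar\zeta + \zeta_c}\right),$$

and hence that $z \sim a\bar\zeta/2$ as $|\bar\zeta| \to \infty$, so that we need $Ue^{-i\alpha} = 2Ve^{-i\beta}/a$, and
hence $V = aU/2$ and $\beta = \alpha$. The complex potential is therefore

$$w(\bar\zeta) = \frac{1}{2}aU\left(e^{-i\alpha}\bar\zeta + \frac{e^{i\alpha}}{\bar\zeta}\right) + \frac{i\kappa}{2\pi}\log\bar\zeta, \tag{7.45}$$

and hence

$$\frac{dw}{d\bar\zeta} = \frac{1}{2}aU\left(e^{-i\alpha} - \frac{e^{i\alpha}}{\bar\zeta^2}\right) + \frac{i\kappa}{2\pi\bar\zeta}. \tag{7.46}$$

We can therefore make the trailing edge, $\bar\zeta = (1 - \zeta_c)/a = \bar\zeta_s$, a stagnation point
by choosing

$$\kappa = i\pi aU\left(e^{-i\alpha}\bar\zeta_s - \frac{e^{i\alpha}}{\bar\zeta_s}\right). \tag{7.47}$$

This expression is real, because $|\bar\zeta_s| = 1$ and hence the quantities $e^{-i\alpha}\bar\zeta_s$ and $e^{i\alpha}/\bar\zeta_s$
are complex conjugate.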

In order to calculate the force on the Joukowski profile, we need to evaluate

$$F_x - iF_y = \frac{1}{2}i\rho\oint_{|\bar\zeta|=1}\left(\frac{dw}{d\bar\zeta}\right)^2\frac{d\bar\zeta}{dz}\,d\bar\zeta
= \frac{1}{2}i\rho a\oint_{|\bar\zeta|=1}\frac{\left\{\frac{1}{2}U\left(e^{-i\alpha} - e^{i\alpha}/\bar\zeta^2\right) + i\kappa/2\pi a\bar\zeta\right\}^2}{\frac{1}{2}\left\{1 - 1/(a\bar\zeta + \zeta_c)^2\right\}}\,d\bar\zeta.$$

This integral can be evaluated using residue calculus. Since we know that the
integrand is analytic for $|\bar\zeta| > 1$, we can evaluate the residue at the point at infinity
by making the transformation $\bar\zeta = 1/s$ and evaluating the residue at $s = 0$. We
find, after some algebra, that

$$F_x - iF_y = -i\rho\kappa Ue^{-i\alpha},$$

or equivalently,

$$F_x + iF_y = \rho\kappa Ue^{i(\pi/2 + \alpha)}.$$

From this, we can see that the lift force on the aerofoil is directed perpendicular to
the incoming stream. The streamlines for the flow around the aerofoils illustrated
in Figure 7.7 are shown in Figure 7.8 with $\alpha = \pi/4$.
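The Kutta condition (7.47) and the resulting lift can be checked numerically. This is a sketch under stated assumptions: `zb` stands for the shifted and scaled variable $\bar\zeta = (\zeta - \zeta_c)/a$, the centre $\zeta_c = 0.25(-1+i)$ is one of the profiles of Fig. 7.7, and the contour is taken on $|\bar\zeta| = 2$ rather than $|\bar\zeta| = 1$, which is permissible because the integrand is analytic outside the unit circle.

```python
import cmath
import math

rho, U, alpha = 1.0, 1.0, math.pi / 4      # assumed illustrative values
zeta_c = 0.25 * (-1 + 1j)                  # centre of the circle in the zeta-plane
a = abs(1 - zeta_c)                        # radius: the circle passes through zeta = 1
zs = (1 - zeta_c) / a                      # trailing edge in the zb-plane, |zs| = 1

# Kutta condition, equation (7.47); the imaginary part should vanish.
kappa_c = 1j * math.pi * a * U * (cmath.exp(-1j * alpha) * zs - cmath.exp(1j * alpha) / zs)
kappa = kappa_c.real

def dw(zb):
    """dw/d(zb), equation (7.46)."""
    return (0.5 * a * U * (cmath.exp(-1j * alpha) - cmath.exp(1j * alpha) / zb ** 2)
            + 1j * kappa / (2 * math.pi * zb))

def dz(zb):
    """dz/d(zb) for z = (zeta + 1/zeta)/2 with zeta = a*zb + zeta_c."""
    return 0.5 * a * (1 - 1 / (a * zb + zeta_c) ** 2)

def force(radius=2.0, n=4096):
    """F_x - i F_y from the contour integral, by the periodic trapezoidal rule."""
    total = 0j
    for k in range(n):
        zb = radius * cmath.exp(2j * math.pi * k / n)
        total += dw(zb) ** 2 / dz(zb) * (1j * zb * 2 * math.pi / n)
    return 0.5j * rho * total

F = force()
expected = -1j * rho * kappa * U * cmath.exp(-1j * alpha)
```

Here `F` should agree with `expected`, and `dw(zs)` should vanish, confirming that the trailing edge is a stagnation point in the transform plane.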


Exercises 

7.1 Find and sketch the regions in the (x, y)-plane where the equation 

(1 + X)(j) xx T 2 Xytfixy y (j)yy = 0 

is elliptic, parabolic and hyperbolic. 

7.2 Determine the type of each of the equations 

(a) 4*XX 4” = 0, 

(b) (frxx 2 (j)xy 4" 4“ 24 (f)y T 5 (f) — 0, 

(c) (j) xx + §y<t>x V + 9y 2 4> yy 4- 40 = o, 

and reduce them to canonical form. 

7.3 Determine the characteristics of Tricomi’s equation, $\phi_{xx} = x\phi_{yy}$, in the
hyperbolic region, $x > 0$. Transform the equation to canonical form in (a)
the hyperbolic region and (b) the elliptic region.

7.4 Show that any solution, $\phi(x,t)$, of the one-dimensional wave equation,
(3.39), satisfies the difference equation $y_1 - y_2 + y_3 - y_4 = 0$, where $y_1$, $y_2$, $y_3$
and $y_4$ are the values of the solution at successive corners of any quadrilateral
whose edges are characteristics.
$$\phi(0, t) = \tfrac{1}{2}t^2 \quad \text{for } t \ge 0.$$

By differentiating the canonical form of the equation, obtain an equation
for the jump in a second derivative of $\phi$ across the characteristic curve
$x = ct$. By solving this equation, show that the jump in $\phi_{tt}$ across $x = ct$
is $e^{-kcx}$.

7.6 Consider the boundary value problem given by (7.25) and (7.26). By using
suitable functions $\phi = Ax^2 + By^2$ that satisfy $\nabla^2\phi = 1$ with $A > 0$ and
$B > 0$, show that

$$\frac{a^2b^2}{2(a + b)^2} \le \phi(0, 0) \le \frac{a^2b^2}{2(a^2 + b^2)}.$$

7.7 The function $\phi(r, \theta)$ satisfies Laplace’s equation in region $a < r < b$, where
$r$ and $\theta$ are polar coordinates and $a$ and $b$ are positive constants. Using
the maximum principle and the fact that $\nabla^2(\log r) = 0$, show that

$$m(R)\log\left(\frac{b}{a}\right) \le m(b)\log\left(\frac{R}{a}\right) + m(a)\log\left(\frac{b}{R}\right),$$

where $m(R)$ is the maximum value of $\phi$ on the circle $r = R$, with $a < R < b$.

7.8 (a) Prove Theorem 7.4. (b) Hence show that the solution of the initial-boundary
value problem given by (7.30) to (7.32) is unique.

7.9 Consider the initial-boundary value problem

$$\phi_t = k\nabla^2\phi - \phi^3 \quad \text{for } \mathbf{x} \in D,\ t > 0,$$

with $k > 0$, subject to

$$\phi(\mathbf{x}, 0) = \phi_0(\mathbf{x}) \quad \text{for } \mathbf{x} \in D,$$

and

$$\frac{\partial\phi}{\partial n} = 0 \quad \text{for } \mathbf{x} \in \partial D.$$

Use Theorem 7.5 to show that $\phi \to 0$ as $t \to \infty$, uniformly in $D$.

7.10 Show that the angle between two line segments that intersect at $z = z_0$
in the complex $z$-plane is unchanged by a conformal mapping $\zeta = f(z)$,
provided that $f'(z_0)$ is bounded and nonzero.

7.11 Show that, if $\phi(x, y) = \Phi(\xi(x, y), \eta(x, y))$, $\zeta = \xi + i\eta$ and $z = x + iy$, then

$$\frac{\partial^2\phi}{\partial x^2} + \frac{\partial^2\phi}{\partial y^2} = \left|\frac{d\zeta}{dz}\right|^2\left(\frac{\partial^2\Phi}{\partial\xi^2} + \frac{\partial^2\Phi}{\partial\eta^2}\right).$$

7.12 Show that the mapping $\zeta = \sin(\pi z/2k)$ maps a semi-infinite strip in the
$z$-plane to the upper half $\zeta$-plane.

7.13 Show that the image of a circle of radius $r$ in the $z$-plane under the
transformation $\zeta = (z + 1/z)/2$ is an ellipse.

7.14 Find the function $T(x, y)$ that is harmonic in the lens-shaped region defined
by the intersection of the circles $|z - i| = 1$ and $|z - 1| = 1$ and takes the
value 0 on the upper circular arc and 1 on the lower circular arc.

Hint: Consider the effect of the transformation $\zeta = z/(z - 1 - i)$ on
this region.

7.15 Find the lift on an aerofoil section that consists of a unit circle centred on
the origin, joined at $z = 1$ to a horizontal flat plate of unit length, when a
unit stream is incident at an angle of 45°. Hint: Firstly apply a Joukowski
transformation to map the section to a straight line, and then use an inverse
Joukowski transformation to map this straight line to a circle.
CHAPTER EIGHT


Existence, Uniqueness, Continuity and 
Comparison of Solutions of Ordinary Differential 
Equations 


Up to this point in the book we have implicitly assumed that each of the differential 
equations that we have been studying has a solution of the appropriate form. This 
is not always the case. For example, the equation 

has no real solution for $|y| < 1$. When we ask whether an equation actually has a
solution, we are considering the question of existence of solutions. A related issue 
is that of uniqueness of solutions. If we have a differential equation for which 
solutions exist, does the prescription of initial conditions specify a unique solution? 
Not necessarily. Consider the equation 

$$\frac{dy}{dt} = 3y^{2/3} \quad \text{subject to } y(0) = 0 \text{ for } y \ge 0.$$

If we integrate this separable equation we obtain $y = (t - c)^3$. The initial condition
gives $c = 0$ and the solution $y = t^3$. However, it is fairly obvious that $y = 0$ is
another solution that satisfies the initial condition. In fact, this equation has an
infinite number of solutions that satisfy the initial condition, namely

$$y = \begin{cases} 0 & \text{for } 0 \le t \le c, \\ (t - c)^3 & \text{for } t > c, \end{cases}$$

for any $c \ge 0$.
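The non-uniqueness is easy to confirm directly. The short sketch below (function names are illustrative) substitutes several members of the family into the equation and checks that each satisfies it, while all share the same initial value.

```python
def y_c(t, c):
    """Member of the family: zero up to t = c, then (t - c)^3."""
    return 0.0 if t <= c else (t - c) ** 3

def dy_c(t, c):
    """Exact derivative of y_c with respect to t."""
    return 0.0 if t <= c else 3.0 * (t - c) ** 2

def residual(t, c):
    """y' - 3 y^(2/3), which is zero when y_c satisfies the equation at time t."""
    return dy_c(t, c) - 3.0 * y_c(t, c) ** (2.0 / 3.0)

times = [0.05 * k for k in range(61)]            # 0 <= t <= 3
worst = {c: max(abs(residual(t, c)) for t in times) for c in (0.0, 0.5, 1.0, 2.0)}
values_at_t3 = {c: y_c(3.0, c) for c in worst}   # different solutions, same initial data
```

Every entry of `worst` should be zero to rounding error, yet the values at $t = 3$ differ from one choice of $c$ to the next.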

Another question that we should consider arises from the process of mathematical 
modelling. If a differential equation provides a model of a real physical system, 
how sensitive are the solutions of the differential equation to changes in the initial 
conditions? If we performed two experiments with nearly identical initial conditions 
(exact repetition of an experiment being impossible in practice), should we expect 
nearly identical outcomes? 

In this chapter we will prove some rigorous results related to the issues we have 
discussed above. 
8.1 Local Existence of Solutions

We will start with the simplest type of scalar differential equation by proving a
local existence theorem for

$$\frac{dy}{dt} = f(y, t) \quad \text{subject to } y(t_0) = y_0 \text{ in the domain } |t - t_0| \le \alpha. \tag{8.1}$$

Here, $\alpha > 0$ defines the size of the region where we will be able to show that a
solution exists. We begin by defining a closed rectangle,

$$R = \{(y, t) \mid |y - y_0| \le b,\ |t - t_0| \le a\},$$

centred upon the initial point, $(y_0, t_0)$, within which we will make certain
assumptions about the behaviour of $f$. If we integrate (8.1) with respect to $t$ and apply
the initial condition, we obtain

$$y(t) = y_0 + \int_{t_0}^{t} f(y(s), s)\,ds. \tag{8.2}$$

This is an integral equation for the unknown solution, $y = y(t)$, which is equivalent
to the original differential equation. Our strategy now is to use this to produce a
set of successive approximations to the solution that we seek. We will do this
using the initial condition as the starting point.

We define

$$\begin{aligned}
y_0(t) &= y_0, \\
y_1(t) &= y_0 + \int_{t_0}^{t} f(y_0, s)\,ds, \\
y_2(t) &= y_0 + \int_{t_0}^{t} f(y_1(s), s)\,ds, \\
&\ \ \vdots \\
y_{k+1}(t) &= y_0 + \int_{t_0}^{t} f(y_k(s), s)\,ds.
\end{aligned} \tag{8.3}$$

As an example of how this works, consider the simple differential equation $y' = y$
subject to $y(0) = 2$. In this case,

$$\begin{aligned}
y_0(t) &= 2, \\
y_1(t) &= 2 + \int_0^t 2\,ds = 2(1 + t), \\
y_2(t) &= 2 + \int_0^t 2(1 + s)\,ds = 2\left(1 + t + \tfrac{1}{2}t^2\right), \\
&\ \ \vdots \\
y_{k+1}(t) &= 2\sum_{n=0}^{k+1}\frac{t^n}{n!}.
\end{aligned}$$

As $k \to \infty$, $\{y_k(t)\} \to 2e^t$, the correct solution. In general, if the sequence of
functions (8.3) converges uniformly to a limit, $\{y_k(t)\} \to y_\infty(t)$, then $y_\infty(t)$ is a
continuous solution of the integral equation. For continuous $f(y, t)$, we can then
differentiate under the integral sign, to show that $y$ is a solution of the original
differential equation, (8.1). We will shortly prove that, in order to ensure that the
sequence (8.3) converges to a continuous function, it is sufficient that $f(y, t)$ and
$\partial f/\partial y\,(y, t)$ are continuous in $R$. In other words, we need $f$, $\partial f/\partial y \in C^0(R)$.†
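The successive approximations for $y' = y$, $y(0) = 2$ can be generated mechanically. The sketch below is an illustration only: it stores each iterate as a list of exact polynomial coefficients (a representation chosen here, not taken from the text) and applies the integral in (8.3) term by term.

```python
from fractions import Fraction
from math import exp, factorial

def picard_step(coeffs, y0):
    """One step of (8.3) for f(y, t) = y and t0 = 0.

    coeffs[n] is the coefficient of t^n in y_k(t); the result holds the
    coefficients of y_{k+1}(t) = y0 + integral from 0 to t of y_k(s) ds."""
    integrated = [coeff / (n + 1) for n, coeff in enumerate(coeffs)]
    return [Fraction(y0)] + integrated

def picard_iterates(y0, steps):
    iterates = [[Fraction(y0)]]
    for _ in range(steps):
        iterates.append(picard_step(iterates[-1], y0))
    return iterates

def evaluate(coeffs, t):
    return sum(float(c) * t ** n for n, c in enumerate(coeffs))

iterates = picard_iterates(2, 12)
error_at_1 = abs(evaluate(iterates[12], 1.0) - 2.0 * exp(1.0))
```

The first iterates reproduce $2(1 + t)$ and $2(1 + t + \tfrac{1}{2}t^2)$, and the coefficient of $t^n$ in every later iterate is $2/n!$, so that `error_at_1` is already very small after twelve steps.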
As a preliminary to the main proof, recall Theorem A2.1, which says that a
function continuous on a bounded region is itself bounded, so that there exist
strictly positive, real constants $M$ and $K$ such that $|f(y, t)| \le M$ and
$\left|\frac{\partial f}{\partial y}(y, t)\right| \le K$
for all $(y, t) \in R$. If $(y_1, t)$ and $(y_2, t)$ are two points in $R$, Theorem A2.2, the mean
value theorem, states that

$$f(y_2, t) - f(y_1, t) = \frac{\partial f}{\partial y}(c, t)(y_2 - y_1) \quad \text{for some } c \text{ with } y_1 < c < y_2.$$

Since $(c, t) \in R$, we have $\left|\frac{\partial f}{\partial y}(c, t)\right| \le K$, and hence

$$|f(y_2, t) - f(y_1, t)| \le K|y_2 - y_1| \quad \forall\,(y_2, t), (y_1, t) \in R. \tag{8.4}$$

Functions that satisfy the inequality (8.4) are said to satisfy a Lipschitz condition
in $R$. It is possible for a function to satisfy a Lipschitz condition in a region
$R$ without having a continuous partial derivative everywhere in $R$. For example,
$f(y, t) = t|y|$ satisfies $|f(y_2, t) - f(y_1, t)| \le |y_2 - y_1|$ in the unit square centred on
the origin, so that it is Lipschitz with $K = 1$. However, $\partial f/\partial y$ is not continuous on
the line $y = 0$. Our assumption about the continuity of the first partial derivative
automatically leads to functions that satisfy a Lipschitz condition, which is the key
to proving the main result of this section.

As we stated earlier, we are going to use the successive approximations (8.3) to
establish the existence of a solution. Prior to using this, we must show that the
elements of this sequence are well-defined. Specifically, if $y_{k+1}(t)$ is to be defined on
some interval $I$, we must establish that the point $(y_k(s), s)$ remains in the rectangle
$R = \{(y, t) \mid |y - y_0| \le b,\ |t - t_0| \le a\}$ for all $s \in I$.


Lemma 8.1 If $\alpha = \min(a, b/M)$, then the successive approximations,

$$y_0(t) = y_0, \quad y_{k+1}(t) = y_0 + \int_{t_0}^{t} f(y_k(s), s)\,ds$$

are well-defined in the interval $I = \{t \mid |t - t_0| \le \alpha\}$, and on this interval
$|y_k(t) - y_0| \le M|t - t_0| \le b$, where $|f| \le M$.


Proof We proceed by induction. It is clear that $y_0(t)$ is defined on $I$, as it is
constant. We assume that

$$y_n(t) = y_0 + \int_{t_0}^{t} f(y_{n-1}(s), s)\,ds$$

is well-defined on $I$, so that the point $(y_n(t), t)$ remains in $R$ for $t \in I$. By definition,

$$y_{n+1}(t) = y_0 + \int_{t_0}^{t} f(y_n(s), s)\,ds,$$

so we have $y_{n+1}(t)$ defined in $I$. Now

$$|y_{n+1}(t) - y_0| = \left|\int_{t_0}^{t} f(y_n(s), s)\,ds\right| \le \int_{t_0}^{t} |f(y_n(s), s)|\,ds \le M|t - t_0| \le M\alpha \le b,$$

which is what we claimed. □

† If these conditions hold in a larger domain, $D$, we can use the local result repeatedly until the
solution moves out of $D$.


To see rather less formally why we chose $\alpha = \min(a, b/M)$, note that the
condition $|f(y, t)| \le M$ implies that the solution of the differential equation, $y = y(t)$,
cannot cross lines of slope $M$ or $-M$ through the initial point $(y_0, t_0)$, as shown
in Figure 8.1. The relationship $|y_k(t) - y_0| \le M|t - t_0|$, which we established in
Lemma 8.1, means that the successive approximations cannot cross these lines
either. These lines intersect the boundary of the rectangle $R$, and the length of the
interval $I$ depends upon whether they meet the horizontal sides ($\alpha = b/M$) or the
vertical sides ($\alpha = a$), as shown in Figure 8.1.

We can now proceed to establish the main result of this section. 

Theorem 8.1 (Local existence) If $f$ and $\partial f/\partial y$ are in $C^0(R)$, then the successive
approximations $y_k(t)$, defined by (8.3), converge on $I$ to a solution of the differential
equation $y' = f(y, t)$ that satisfies the initial condition $y(t_0) = y_0$.


Proof We begin with the identity

$$\begin{aligned}
y_j(t) &= y_0(t) + \{y_1(t) - y_0(t)\} + \{y_2(t) - y_1(t)\} + \cdots + \{y_j(t) - y_{j-1}(t)\} \\
&= y_0(t) + \sum_{n=0}^{j-1}\{y_{n+1}(t) - y_n(t)\}.
\end{aligned} \tag{8.5}$$

In order to use this, we need to estimate the value of $y_{n+1}(t) - y_n(t)$. Using the
definition (8.3), we have, for $n \ge 1$ and $|t - t_0| \le \alpha$,

$$\begin{aligned}
|y_{n+1}(t) - y_n(t)| &= \left|\int_{t_0}^{t}\{f(y_n(s), s) - f(y_{n-1}(s), s)\}\,ds\right| \\
&\le \int_{t_0}^{t}|f(y_n(s), s) - f(y_{n-1}(s), s)|\,ds \le K\int_{t_0}^{t}|y_n(s) - y_{n-1}(s)|\,ds.
\end{aligned}$$

For $n = 0$ we have

$$|y_1(t) - y_0(t)| = \left|\int_{t_0}^{t} f(y_0(s), s)\,ds\right| \le M|t - t_0|,$$

by using the continuity bound on $f$. If we now repeatedly use these inequalities, it
is straightforward to show that

$$|y_{n+1}(t) - y_n(t)| \le \frac{MK^n|t - t_0|^{n+1}}{(n + 1)!} \quad \text{for } n \ge 1,\ |t - t_0| \le \alpha,$$


Fig. 8.1. The two cases that determine the length of the interval, $\alpha$: (i) $\alpha = b/M$, (ii) $\alpha = a$.



and hence that

$$|y_{n+1}(t) - y_n(t)| \le \frac{M(K\alpha)^{n+1}}{K(n + 1)!}. \tag{8.6}$$

If we denote by $y(t)$ the limit of the sequence $\{y_k(t)\}$, we now have, from (8.5),

$$y(t) = y_0(t) + \sum_{n=0}^{\infty}\{y_{n+1}(t) - y_n(t)\}. \tag{8.7}$$

Each term in this infinite series is dominated by $M(K\alpha)^{n+1}/K(n + 1)!$, a positive
constant; and

$$\frac{M}{K}\sum_{n=0}^{\infty}\frac{(K\alpha)^{n+1}}{(n + 1)!} = \frac{M}{K}\left(e^{K\alpha} - 1\right), \tag{8.8}$$

which is a finite constant. The comparison theorem applied to the series in (8.7)
shows that it converges absolutely on the interval $|t - t_0| \le \alpha$, and we have now
proved that the sequence of successive approximations, (8.3), converges. We can
now return to Lemma 8.1 with the knowledge that $y(t) = \lim_{k\to\infty} y_k(t)$ is
well-defined, and, since the right hand side of (8.8) is independent of $k$, immediately
claim that $(y(s), s) \in R$.

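We now show that the limiting function, $y(t)$, actually satisfies the integral
equation, (8.2). Now

$$\begin{aligned}
|y(t) - y_k(t)| &= \left|\sum_{n=k}^{\infty}\{y_{n+1}(t) - y_n(t)\}\right| \le \frac{M}{K}\sum_{m=k}^{\infty}\frac{(K\alpha)^{m+1}}{(m + 1)!} \\
&\le \frac{M}{K}\frac{(K\alpha)^{k+1}}{(k + 1)!}\sum_{m=0}^{\infty}\frac{(K\alpha)^m}{m!} = \frac{M}{K}\frac{(K\alpha)^{k+1}}{(k + 1)!}e^{K\alpha} \quad \text{for } |t - t_0| \le \alpha.
\end{aligned} \tag{8.9}$$

Using the Lipschitz condition in the interval $|t - t_0| \le \alpha$ and Lemma 8.1,

$$\left|\int_{t_0}^{t}\{f(y(s), s) - f(y_k(s), s)\}\,ds\right| \le K\int_{t_0}^{t}|y(s) - y_k(s)|\,ds \le \alpha\frac{M(K\alpha)^{k+1}}{(k + 1)!}e^{K\alpha}.$$

For fixed $\alpha$, $M$ and $K$,

$$\alpha\frac{M(K\alpha)^{k+1}}{(k + 1)!}e^{K\alpha} \to 0 \quad \text{as } k \to \infty,$$

so that

$$\lim_{k\to\infty}\int_{t_0}^{t} f(y_k(s), s)\,ds = \int_{t_0}^{t} f(y(s), s)\,ds,$$

and consequently $y(t)$ satisfies (8.2). □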
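The error estimate (8.9) can be compared with the actual error for the example $y' = y$, $y(0) = 2$ considered earlier. The rectangle used below ($a = 1$, $b = 2$) is an assumption made purely for illustration; for this choice $M = y_0 + b$ and $K = 1$.

```python
from math import exp, factorial

y0, a, b = 2.0, 1.0, 2.0      # illustrative rectangle |t| <= a, |y - y0| <= b
M = y0 + b                    # maximum of |f| = |y| on the rectangle
K = 1.0                       # bound on |df/dy| = 1
alpha = min(a, b / M)

def y_k(t, k):
    """k-th successive approximation for y' = y, y(0) = y0."""
    return y0 * sum(t ** n / factorial(n) for n in range(k + 1))

def bound(k):
    """Right hand side of (8.9)."""
    return (M / K) * (K * alpha) ** (k + 1) / factorial(k + 1) * exp(K * alpha)

ts = [-alpha + 2 * alpha * j / 40 for j in range(41)]
rows = []
for k in range(8):
    err = max(abs(y0 * exp(t) - y_k(t, k)) for t in ts)   # true error on |t| <= alpha
    rows.append((k, err, bound(k)))
```

Each row of `rows` pairs the observed error with the bound; the bound should dominate the error and both should decay factorially with $k$.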


To prove that $y(t)$ is continuous on $|t - t_0| \le \alpha$, we can use some of the auxiliary
results that we derived above. Consider

$$y(t + h) - y(t) = y(t + h) - y_k(t + h) + y_k(t + h) - y_k(t) + y_k(t) - y(t),$$

so that, using the triangle inequality,

$$|y(t + h) - y(t)| \le |y(t + h) - y_k(t + h)| + |y_k(t + h) - y_k(t)| + |y_k(t) - y(t)|.$$

By choosing $k$ sufficiently large, using the estimate (8.9), we have

$$|y_k(t) - y(t)| < \epsilon_k,$$

so that

$$|y(t + h) - y(t)| < 2\epsilon_k + |y_k(t + h) - y_k(t)| < \epsilon,$$

for sufficiently small $h$, using the continuity of $y_k(t)$. Hence, $y(t)$ is continuous, as
we claimed at the beginning of this section, and the equivalence of the integral and
differential equations can be established by differentiation under the integral sign.


Remarks 

(i) For a particular differential equation, Theorem 8.1 is easy to use. For
example, if $y' = y + t$, both $f(y, t) = y + t$ and $\partial f/\partial y = 1$ are continuous for
$-\infty < t < \infty$, $-\infty < y < \infty$, so the successive approximations are
guaranteed to converge in this domain. If $y' = 3y^{2/3}$, $f(y, t) = 3y^{2/3}$, which is
continuous in $-\infty < y < \infty$, $-\infty < t < \infty$. However, its partial derivative,
$\partial f/\partial y = 2y^{-1/3}$, is discontinuous at $y = 0$. Consequently, the successive
approximations are not guaranteed to converge. This should be no surprise,
since, as we noted in the introduction to this chapter, the solution of this
equation is not unique for some initial data.

(ii) Consider the equation $y' = y^2$ subject to $y(0) = 1$. Since this equation
satisfies the conditions of Theorem 8.1, the successive approximations will
converge, and therefore a solution exists in some rectangle containing the
initial point, namely

$$R = \{(y, t) \mid |t| \le a,\ |y - 1| \le b\}.$$

Let’s now try to determine the values of the constants $a$ and $b$. Since
$M = \max_R y^2 = (1 + b)^2$, $\alpha = \min(a, b/(1 + b)^2)$. From simple calculus,

$$\max_{b > 0}\frac{b}{(1 + b)^2} = \frac{1}{4},$$

so that, independent of the particular choice of $a$, $\alpha \le \frac{1}{4}$. Theorem 8.1
therefore shows that a solution exists for $|t| \le \frac{1}{4}$. In fact, the solution,
$y = 1/(1 - t)$, exists for $-\infty < t < 1$. It is rather more difficult to determine
regions of global existence like this, than it is to establish local existence,
the point of Theorem 8.1, and we will not attempt it here.
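The estimate in Remark (ii) can be reproduced by a direct scan over $b$. This is a small sketch: the grid of $b$ values and the large value of $a$ are arbitrary choices made so that $a$ is not the limiting factor.

```python
def alpha(a, b):
    """alpha = min(a, b/M) for y' = y^2, y(0) = 1, where M = (1 + b)^2."""
    return min(a, b / (1.0 + b) ** 2)

# Scan b > 0 with a taken large enough not to restrict alpha.
bs = [0.01 * k for k in range(1, 1001)]
best_b = max(bs, key=lambda b: alpha(10.0, b))
best_alpha = alpha(10.0, best_b)

def exact(t):
    """Exact solution y = 1/(1 - t), valid for t < 1."""
    return 1.0 / (1.0 - t)

guaranteed = [exact(t) for t in (-0.25, 0.0, 0.25)]   # inside |t| <= 1/4
beyond = [exact(t) for t in (0.5, 0.9, 0.99)]         # exists, but outside the guaranteed interval
```

The scan should pick out $b = 1$ and $\alpha = \frac{1}{4}$, while the values in `beyond` show the solution continuing to exist, and growing rapidly, as $t$ approaches 1.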

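(iii) We can also consider the local existence of solutions of systems of first order
equations in the form

$$\mathbf{y}' = \mathbf{f}(\mathbf{y}, t), \tag{8.10}$$

where

$$\mathbf{y} = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}, \quad
\mathbf{f} = \begin{pmatrix} f_1(y_1, y_2, \ldots, y_n, t) \\ f_2(y_1, y_2, \ldots, y_n, t) \\ \vdots \\ f_n(y_1, y_2, \ldots, y_n, t) \end{pmatrix}$$

are vectors with $n$ components and $t$ is a real scalar. The vector form of
the Lipschitz condition can be established by requiring $\mathbf{f}$ and $\partial\mathbf{f}/\partial y_i$ to be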
continuous with respect to all of their components and $t$, so that



for all $(\mathbf{y}, t) \in D \subset \mathbb{R}^{n+1}$, where $\|\cdot\|$ is a suitable vector norm. It then follows
that for any points $(\mathbf{y}, t)$ and $(\mathbf{z}, t)$ in $D$, $\|\mathbf{f}(\mathbf{y}, t) - \mathbf{f}(\mathbf{z}, t)\| \le K\|\mathbf{y} - \mathbf{z}\|$. The
proof then follows that of Theorem 8.1 step by step and line by line, with
modulus signs replaced by vector norms.

(iv) For a selection of results similar to, but in some ways better than, Theo- 
rem 8.1, see Coddington and Levinson (1955). 


8.2 Uniqueness of Solutions 

We will now show that, under the same conditions as Theorem 8.1, the solution of 
the initial value problem (8.1) is unique. In order to prove this, we need a result 
called Gronwall’s inequality. 


Lemma 8.2 (Gronwall’s inequality) If $f(t)$ and $g(t)$ are non-negative functions
on the interval $\alpha \le t \le \beta$, $L$ is a non-negative constant and

$$f(t) \le L + \int_{\alpha}^{t} f(s)g(s)\,ds \quad \text{for } t \in [\alpha, \beta],$$

then

$$f(t) \le L\exp\left\{\int_{\alpha}^{t} g(s)\,ds\right\} \quad \text{for } t \in [\alpha, \beta].$$


Proof Define

$$h(t) = L + \int_{\alpha}^{t} f(s)g(s)\,ds,$$

so that $h(\alpha) = L$. By hypothesis, $f(t) \le h(t)$, and by the fundamental theorem of
calculus, since $g(t) \ge 0$, we have

$$h'(t) = f(t)g(t) \le h(t)g(t) \quad \text{for } t \in [\alpha, \beta].$$

Theorem 8.2 (Uniqueness) If $f$, $\partial f/\partial y \in C^0(R)$, then the solution of the initial
value problem $y' = f(y, t)$ subject to $y(t_0) = y_0$ is unique on $|t - t_0| \le \alpha$.


Proof We proceed by contradiction. Suppose that there exist two solutions,
$y = y_1(t)$ and $y = y_2(t)$, with $y_1(t_0) = y_2(t_0) = y_0$. These functions satisfy the integral
equations

$$y_1(t) = y_0 + \int_{t_0}^{t} f(y_1(s), s)\,ds, \quad y_2(t) = y_0 + \int_{t_0}^{t} f(y_2(s), s)\,ds, \tag{8.11}$$

and hence the points $(y_i(s), s)$ lie in the region $R$. Taking the modulus of the
difference of these leads to

$$\begin{aligned}
|y_1(t) - y_2(t)| &= \left|\int_{t_0}^{t}\{f(y_1(s), s) - f(y_2(s), s)\}\,ds\right| \\
&\le \int_{t_0}^{t}|f(y_1(s), s) - f(y_2(s), s)|\,ds \le \int_{t_0}^{t} K|y_1(s) - y_2(s)|\,ds \quad \text{for } |t - t_0| \le \alpha,
\end{aligned}$$

where $K > 0$ is the Lipschitz constant for the region $R$. We can now apply
Gronwall’s inequality to the non-negative function $|y_1(t) - y_2(t)|$ with $L = 0$ and
$g(s) = K > 0$ to conclude that $|y_1(t) - y_2(t)| \le 0$. However, since the modulus of
a function cannot be negative, $|y_1(t) - y_2(t)| = 0$, and hence $y_1(t) = y_2(t)$. □


Just as we saw earlier when discussing the existence of solutions, this uniqueness 
result can be extended to systems of ordinary differential equations. Under the 
conditions given in Remark (iii) in the previous section, the solution of (8.10) exists 
and is unique. 


8.3 Dependence of the Solution on the Initial Conditions 

A solution of $y' = f(y, t)$ that passes through the initial point $(y_0, t_0)$ depends
continuously on the three variables $t_0$, $y_0$ and $t$. For example, $y' = 3y^{2/3}$ subject
to $y(t_0) = y_0$ has solution $y = (t - t_0 + y_0^{1/3})^3$, which depends continuously on $t_0$,
$y_0$ and $t$. Earlier, we hypothesized that solutions of identical differential equations
with initial conditions that are close to each other should remain close, at least for
values of $t$ close to the initial point, $t = t_0$. We can now prove a theorem about
this.

Theorem 8.3 Let $y = y_0(t)$ be the solution of $y' = f(y, t)$ that passes through the
initial point $(t_0, y_0)$ and $y = y_1(t)$ the solution that passes through $(t_1, y_1)$. If $f$,
$\partial f/\partial y \in C^0(R)$ and both $y_0(t)$ and $y_1(t)$ exist on some interval $\alpha < t < \beta$ with $t_0$,
$t_1 \in (\alpha, \beta)$, then $\forall\epsilon > 0$ $\exists\delta > 0$ such that if $|t_0 - t_1| < \delta$ and $|y_0 - y_1| < \delta$ then
$|y_0(t) - y_1(t)| < \epsilon$ $\forall t \in (\alpha, \beta)$.
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Proof Since yi(t) and y 2 (t) satisfy the integral equations (8.11), we can take their 
difference and, by splitting up the range of integration, writef 

yo(t) - yi(t) = y 0 - yi + [ f{y 0 {s),s)ds+ I {f{yo(s),s) - f(yi(s),s)} ds. 

Jt 0 Jtx 

Taking the modulus gives 


$$|y_0(t) - y_1(t)| \leq |y_0 - y_1| + \left|\int_{t_0}^{t_1} f(y_0(s),s)\,ds\right| + \left|\int_{t_1}^{t} \{f(y_0(s),s) - f(y_1(s),s)\}\,ds\right|$$

$$\leq |y_0 - y_1| + M(t_1 - t_0) + \int_{t_1}^{t} K|y_0(s) - y_1(s)|\,ds,$$

by using the upper bound $M$ on $f$ and the Lipschitz condition on $R$. If $|t_1 - t_0| < \delta$
and $|y_1 - y_0| < \delta$, this reduces to

$$|y_0(t) - y_1(t)| < (M + 1)\delta + \int_{t_1}^{t} K|y_0(s) - y_1(s)|\,ds.$$

We can now apply Gronwall’s inequality to obtain

$$|y_0(t) - y_1(t)| < (M + 1)\delta\exp\left\{\int_{t_1}^{t} K\,ds\right\} = (M + 1)\delta\exp\{K(t - t_1)\},$$

and, since $|t - t_1| < \beta - \alpha$, we have

$$|y_0(t) - y_1(t)| < (M + 1)\delta\exp\{K(\beta - \alpha)\}.$$

If we now choose $\delta < \epsilon\exp\{-K(\beta - \alpha)\}/(M + 1)$, we have $|y_0(t) - y_1(t)| < \epsilon$, and
the proof is complete. □
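The bound in this proof can be checked numerically on a simple case. The sketch below is only an illustration under assumptions that are not in the text: we pick the test equation $y' = \sin y$ (so that $M = 1$ and $K = 1$ will do), take $t_0 = t_1 = 0$, and use a hand-written fourth order Runge-Kutta integrator.

```python
import math

def rk4(f, t0, y0, t_end, n):
    """Integrate y' = f(y, t) from t0 to t_end using n classical RK4 steps."""
    h = (t_end - t0) / n
    t, y = t0, y0
    for _ in range(n):
        k1 = f(y, t)
        k2 = f(y + 0.5 * h * k1, t + 0.5 * h)
        k3 = f(y + 0.5 * h * k2, t + 0.5 * h)
        k4 = f(y + h * k3, t + h)
        y += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        t += h
    return y

def f(y, t):
    # Assumed test problem: |f| <= M = 1 and |df/dy| = |cos y| <= K = 1.
    return math.sin(y)

M, K = 1.0, 1.0
span = 2.0      # length of the t-interval, playing the role of beta - alpha
delta = 1e-3

y_a = rk4(f, 0.0, 0.5, span, 2000)                # solution through (0, 0.5)
y_b = rk4(f, 0.0, 0.5 + 0.9 * delta, span, 2000)  # nearby start, |y0 - y1| < delta
gap = abs(y_a - y_b)
bound = (M + 1.0) * delta * math.exp(K * span)    # (M + 1) delta exp{K(beta - alpha)}
print(gap, bound)
```

The computed gap should sit comfortably below the bound, which is far from sharp, as is usual for estimates based on Gronwall’s inequality.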


8.4 Comparison Theorems 

Since a large class of differential equations cannot be solved explicitly, it is useful to 
have a method of placing bounds on the solution of a given equation by comparing 
it with solutions of a related equation that is simpler to solve. For example, $y' =
e^{-ty}$ subject to $y(0) = 1$ is a rather nasty nonlinear differential equation, and the
properties of its solution are not immediately obvious. However, since $0 \leq e^{-ty} \leq 1$
for $0 \leq t < \infty$ and $0 \leq y < \infty$, it would seem plausible that $0 \leq y' \leq 1$. Integration
of this traps the solution of the initial value problem in the range $1 \leq y \leq 1 + t$. In
order to prove this result rigorously, we need some preliminary results and ideas. 
Note that, to simplify things, we will prove all of the results in this section for one 
side of a point only. Firstly, we need some definitions. 
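Before proving anything, the plausible bounds $1 \leq y \leq 1 + t$ can at least be tested numerically. The following is a minimal sketch, assuming a fixed-step fourth order Runge-Kutta scheme and the (arbitrary) interval $0 \leq t \leq 5$; it is a sanity check, not a proof.

```python
import math

def rhs(y, t):
    # Right hand side of y' = exp(-t y).
    return math.exp(-t * y)

def solve(t_end=5.0, n=5000):
    """Return a list of (t, y) samples for y' = exp(-t y), y(0) = 1, using RK4."""
    h = t_end / n
    t, y = 0.0, 1.0
    samples = [(t, y)]
    for _ in range(n):
        k1 = rhs(y, t)
        k2 = rhs(y + 0.5 * h * k1, t + 0.5 * h)
        k3 = rhs(y + 0.5 * h * k2, t + 0.5 * h)
        k4 = rhs(y + h * k3, t + h)
        y += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        t += h
        samples.append((t, y))
    return samples

samples = solve()
# Check the comparison bounds at every sample (small tolerance for rounding).
trapped = all(1.0 <= y <= 1.0 + t + 1e-12 for t, y in samples)
print(trapped)
```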

A function F(y,t) satisfies a one-sided Lipschitz condition in a domain D if, 
for some constant K , 

$$y_2 > y_1 \;\Rightarrow\; F(y_2,t) - F(y_1,t) \leq K(y_2 - y_1) \quad \text{for } (y_2,t),\ (y_1,t) \in D.$$

† We have assumed that $t_1 > t_0$, but the argument needs only a slight modification if this is not
the case.

If $\sigma(t)$ is differentiable for $t > a$, then $\sigma'(t) \leq K$ is called a differential
inequality. Note that if $\sigma'(t) \leq 0$ for $t \geq a$, then $\sigma(t) \leq \sigma(a)$.

Lemma 8.3 If $\sigma(t)$ is a differentiable function that satisfies the differential
inequality $\sigma'(t) \leq K\sigma(t)$ for $t \in [a,b]$ and some constant $K$, then $\sigma(t) \leq \sigma(a)e^{K(t-a)}$ for
$t \in [a,b]$.

Proof We write the inequality as $\sigma'(t) - K\sigma(t) \leq 0$ and multiply through by
$e^{-Kt} > 0$ to obtain

$$\frac{d}{dt}\left\{e^{-Kt}\sigma(t)\right\} \leq 0.$$

This leads to $e^{-Kt}\sigma(t) \leq e^{-Ka}\sigma(a)$, and hence the result. Note that this is really
just a special case of Gronwall’s inequality. □

Lemma 8.4 If $g(t)$ is a solution of $y' = F(y,t)$ for $t \geq a$ and $f(t)$ is a function
that satisfies the differential inequality $f'(t) \leq F(f(t),t)$ for $t \geq a$ with $f(a) = g(a)$,
then, provided that $F$ satisfies a one-sided Lipschitz condition for $t \geq a$, $f(t) \leq g(t)$
for all $t \geq a$.

Proof We will proceed by contradiction. Suppose that $f(t_1) > g(t_1)$ for some
$t_1 > a$. Define $t_0$ to be the largest $t$ in the interval $a \leq t \leq t_1$ such that $f(t) \leq g(t)$,
and hence $f(t_0) = g(t_0)$. Define $\sigma(t) = f(t) - g(t)$, so that $\sigma(t) \geq 0$ for $t_0 \leq t \leq t_1$
and $\sigma(t_0) = 0$. The one-sided Lipschitz condition shows that

$$\sigma'(t) = f'(t) - g'(t) \leq F(f(t),t) - F(g(t),t) \leq K\{f(t) - g(t)\} = K\sigma(t),$$

and hence $\sigma'(t) \leq K\sigma(t)$ for $t \geq t_0$. By Lemma 8.3, $\sigma(t) \leq \sigma(t_0)e^{K(t-t_0)} \leq 0$.
However, since $\sigma(t)$ is non-negative for $t \geq t_0$, $\sigma(t) = 0$. This contradicts the
hypothesis that $\sigma(t_1) = f(t_1) - g(t_1) > 0$, and hence the result is proved. □

Theorem 8.4 Let $f(t)$ and $g(t)$ be solutions of the differential equations $y' =
F(y,t)$ and $z' = G(z,t)$ respectively, on the strip $a \leq t \leq b$, with $f(a) = g(a)$. If
$F(y,t) \leq G(y,t)$ and $F$ or $G$ is one-sided Lipschitz on this strip, then $f(t) \leq g(t)$
for $a \leq t \leq b$.

Proof Since $y' = F(y,t) \leq G(y,t)$, $g$ satisfies a differential inequality of the form
described in Lemma 8.4, and $f$ satisfies the differential equation. This gives us the
result immediately. □

Note that Theorem 8.4 and the preceding two lemmas can be made two-sided in 
the neighbourhood of the initial point with minor modifications. 

Comparison theorems, such as Theorem 8.4, are of considerable practical use. 
Consider as an example the initial value problem 

$$y' = t^2 + y^2 \quad \text{subject to } y(0) = 1. \qquad (8.12)$$

It is known, either from analysis of the solution or by numerical integration, that
the solution of this equation ‘blows up’ to infinity at some finite value of $t = t_\infty$. We
can estimate the position of this blowup point by comparing the solution of (8.12)
with the solutions of similar, but simpler, differential equations. Since $t^2 + y^2 \geq y^2$
for $0 \leq t < \infty$, the solution of (8.12) is bounded below by the solution of $y' = y^2$
subject to $y(0) = 1$, which is $y = 1/(1 - t)$. Since this blows up when $t = 1$, we must
have $t_\infty \leq 1$. Also, since $t^2 + y^2 \leq 1 + y^2$ for $0 \leq t \leq 1$, an upper bound is provided
by the solution of $y' = 1 + y^2$ subject to $y(0) = 1$, namely $y = \tan(t + \pi/4)$. This
blows up when $t = \pi/4$. By sandwiching the solution of (8.12) between two simpler
solutions, we are able to conclude that the blowup time satisfies $\pi/4 \leq t_\infty \leq 1$.
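The sandwich $\pi/4 \leq t_\infty \leq 1$ can be compared with a direct numerical integration of (8.12). The sketch below makes several assumptions of our own: a fixed-step fourth order Runge-Kutta scheme, a step length of $10^{-4}$, and the crude convention that the solution has "blown up" once $y$ exceeds $10^6$. It gives only a rough estimate of $t_\infty$.

```python
import math

def f(t, y):
    # Right hand side of (8.12).
    return t * t + y * y

def blowup_time(h=1e-4, y_max=1e6, t_stop=2.0):
    """Integrate y' = t^2 + y^2, y(0) = 1, until y exceeds y_max (or t reaches t_stop)."""
    t, y = 0.0, 1.0
    while y < y_max and t < t_stop:
        k1 = f(t, y)
        k2 = f(t + 0.5 * h, y + 0.5 * h * k1)
        k3 = f(t + 0.5 * h, y + 0.5 * h * k2)
        k4 = f(t + h, y + h * k3)
        y += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        t += h
    return t

t_inf = blowup_time()
print(math.pi / 4.0, t_inf, 1.0)   # lower bound, numerical estimate, upper bound
```

The estimate should fall between the two comparison bounds; the threshold `y_max` and the step `h` are arbitrary choices, and only the leading digits of the result should be trusted.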

Comparison theorems are also available for second order differential equations. 
These take a particularly simple form for the equation

$$y''(x) + g(x)y(x) = 0, \qquad (8.13)$$

from which the first derivative of $y$ is absent. In fact, we can transform any linear,
second order, ordinary differential equation into this form, as we shall see in
Section 12.2.7. We will compare the solution of (8.13) with that of

$$z''(x) + h(x)z(x) = 0, \qquad (8.14)$$

and prove the following theorem. 

Theorem 8.5 If $g(x) < h(x)$ for $x \geq x_0$, $y(x)$ is the solution of (8.13) with $y(x_0) =
y_0 \neq 0$, $y'(x_0) = y_1$, these conditions being such that $y(x) > 0$ for $x_0 \leq x \leq x_1$, and
$z(x)$ is the solution of (8.14) with $z(x_0) = y_0$ and $z'(x_0) = y_1$, then $y(x) > z(x)$ for
$x_0 < x \leq x_1$, provided that $z(x) > 0$ on this interval.

Proof From (8.13) and (8.14),

$$y''z - yz'' = (h - g)yz.$$

Integrating this equation from $x_0$ to $x < x_1$, we obtain

$$y'(x)z(x) - y(x)z'(x) = \int_{x_0}^{x} (h(t) - g(t))\,y(t)z(t)\,dt.$$

By hypothesis, the right hand side of this is positive. Also, by direct differentiation,

$$\frac{d}{dx}\left(\frac{y(x)}{z(x)}\right) = \frac{y'(x)z(x) - y(x)z'(x)}{z^2(x)},$$

so that $y/z$ is an increasing function of $x$. Since $y(x_0)/z(x_0) = 1$, we have $y(x) >
z(x)$ for $x_0 < x \leq x_1$. □

As an example of the use of Theorem 8.5, consider Airy’s equation,

$$y''(t) - ty(t) = 0$$

(see Sections 3.8 and 11.2), in $-1 < t < 0$ with $y(0) = 1$ and $y'(0) = 0$. By
making the transformation $t \mapsto -x$, we arrive at $y''(x) + xy(x) = 0$ in $0 < x < 1$,
with $y(0) = 1$ and $y'(0) = 0$. The solution of this is positive for $0 \leq x < x_1$.
If we consider $z''(x) + z(x) = 0$ with $z(0) = 1$ and $z'(0) = 0$, then clearly $z =
\cos x$, which is positive for $0 \leq x < \pi/2$, and hence for $0 \leq x \leq 1$. Since the
equations and boundary conditions that govern $y(x)$ and $z(x)$ satisfy the conditions
of Theorem 8.5, we conclude that $y(x) > \cos x$ for $0 < x < 1$. In fact, we can see
from the exact solution,

$$y(x) = \frac{\mathrm{Bi}'(0)\mathrm{Ai}(-x) - \mathrm{Ai}'(0)\mathrm{Bi}(-x)}{\mathrm{Bi}'(0)\mathrm{Ai}(0) - \mathrm{Ai}'(0)\mathrm{Bi}(0)},$$

shown in Figure 8.2, that $x_1 \approx 1.986$.

Fig. 8.2. The exact solution of $y'' + xy = 0$ subject to $y(0) = 1$ and $y'(0) = 0$.
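The conclusion $y(x) > \cos x$ and the position of the first zero can be checked without Airy functions by integrating $y'' + xy = 0$ directly. The sketch below assumes a fixed-step fourth order Runge-Kutta scheme applied to the equivalent first order system $y' = p$, $p' = -xy$, and locates the zero by linear interpolation; these numerical choices are ours.

```python
import math

def integrate(x_end, n):
    """RK4 for y'' + x y = 0 with y(0) = 1, y'(0) = 0; returns a list of (x, y)."""
    h = x_end / n
    x, y, p = 0.0, 1.0, 0.0          # p stands for y'
    out = [(x, y)]
    for _ in range(n):
        k1y, k1p = p, -x * y
        k2y, k2p = p + 0.5 * h * k1p, -(x + 0.5 * h) * (y + 0.5 * h * k1y)
        k3y, k3p = p + 0.5 * h * k2p, -(x + 0.5 * h) * (y + 0.5 * h * k2y)
        k4y, k4p = p + h * k3p, -(x + h) * (y + h * k3y)
        y += h * (k1y + 2.0 * k2y + 2.0 * k3y + k4y) / 6.0
        p += h * (k1p + 2.0 * k2p + 2.0 * k3p + k4p) / 6.0
        x += h
        out.append((x, y))
    return out

path = integrate(2.0, 2000)

# Comparison with z = cos x on 0 < x <= 1, as in Theorem 8.5.
above_cos = all(y > math.cos(x) for x, y in path if 0.0 < x <= 1.0)

# First zero of y, located by linear interpolation between grid points.
x1 = None
for (xa, ya), (xb, yb) in zip(path, path[1:]):
    if ya > 0.0 >= yb:
        x1 = xa + (xb - xa) * ya / (ya - yb)
        break

print(above_cos, x1)
```

The first zero found in this way should agree with the value of $x_1$ quoted above to about three decimal places.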


Exercises 

8.1 Determine the integral equations equivalent to the initial value problems 

(a) $y' = t^2 + y^4$ subject to $y(0) = 1$,

(b) $y' = y + t$ subject to $y(0) = 0$,

and determine the first two successive approximations to the solution. 

8.2 Show that the functions 

(a) $f(y,t) = te^{-y^2}$ for $|t| < 1$, $|y| < \infty$,

(b) $f(y,t) = t^2 + y^2$ for $|t| < 2$, $|y| < 3$,

are Lipschitz in the regions indicated, and find the Lipschitz constant, $K$,
in each case.

8.3 How could successive approximations to the solution of $y' = 3y^{2/3}$ fail to
converge to a solution?

8.4 Let $f(y,t)$ and $g(y,t)$ be continuous and satisfy a Lipschitz condition with
respect to $y$ in a region $D$. Suppose $|f(y,t) - g(y,t)| < \epsilon$ in $D$ for some
$\epsilon > 0$. If $y_1(t)$ is a solution of $y' = f(y,t)$ and $y_2(t)$ is a solution of
$y' = g(y,t)$, such that $|y_2(t_0) - y_1(t_0)| < \delta$ for some $t_0$ and $\delta > 0$, show
that, for all $t$ for which $y_1(t)$ and $y_2(t)$ both exist,

$$|y_2(t) - y_1(t)| < \delta\exp(K|t - t_0|) + \frac{\epsilon}{K}\{\exp(K|t - t_0|) - 1\},$$

where $K$ is the Lipschitz constant. Hint: Use the Gronwall inequality.

8.5 Find upper and lower bounds to the solutions of the differential equations 

(a) $y' = \sin(xy)$ subject to $y(0) = 1/2$ for $x \geq 0$,

(b) $y' = y^3 - y$ subject to $y(0) = 1/4$.

8.6 If $\sigma(t) \in C^1[a, a + \epsilon]$ is positive and satisfies the differential inequality $\sigma' \leq
K\sigma\log\sigma$, show that

$$\sigma(t) \leq \sigma(a)^{\exp\{K(t-a)\}} \quad \text{for } t \in [a, a + \epsilon].$$

8.7 For each fixed $x$, let $F(x,y)$ be a nonincreasing function of $y$. Show that,
if $f(x)$ and $g(x)$ are two solutions of $y' = F(x,y)$ and $b > a$, then $|f(b) -
g(b)| \leq |f(a) - g(a)|$. Hence deduce a result concerning the uniqueness of
solutions. This is known as the Peano uniqueness theorem.



CHAPTER NINE 


Nonlinear Ordinary Differential Equations: Phase 
Plane Methods 


9.1 Introduction: The Simple Pendulum 

Ordinary differential equations can be used to model many different types of phys- 
ical system. We now know a lot about second order linear ordinary differential 
equations. For example, simple harmonic motion, 


$$\frac{d^2\theta}{dt^2} + \omega^2\theta = 0, \qquad (9.1)$$

describes many physical systems that oscillate with small amplitude $\theta$. The general
solution is $\theta = A\sin\omega t + B\cos\omega t$, where $A$ and $B$ are constants that can be fixed
from the initial values of $\theta$ and $d\theta/dt$. The solution is an oscillatory function of $t$.
Note that we can also write this as $\theta = Ce^{i\omega t} + De^{-i\omega t}$, where $C$ and $D$ are complex
constants. In the real world, the physics of a problem is rarely as simple as this.
Let’s consider the frictionless simple pendulum, shown in Figure 9.1. A mass, m, 
is attached to a light, rigid rod of length l, which can rotate without friction about 
the point O. 



Fig. 9.1. A simple pendulum.

Using Newton’s second law on the force perpendicular to the rod gives

$$-mg\sin\theta = ml\frac{d^2\theta}{dt^2},$$

and hence

$$\frac{d^2\theta}{dt^2} + \omega^2\sin\theta = 0, \qquad (9.2)$$

where $\omega^2 = g/l$ and $t$ is time. For oscillations of small amplitude, $\theta \ll 1$, so that
$\sin\theta \sim \theta$, and we obtain simple harmonic motion, (9.1). If $\theta$ is not small, we must
study the full equation of motion, (9.2), which is nonlinear. Do we expect the
solutions to be qualitatively different to those of simple harmonic motion? If we
push a pendulum hard enough, we should be able to make it swing round and round
its point of support, with $\theta$ increasing continuously with $t$ (remember there is no
friction), so we would hope that (9.2) has solutions of this type.

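In general, nonlinear ordinary differential equations cannot be solved analytically,
but for equations like (9.2), where the first derivative, $d\theta/dt$, does not appear
explicitly, an analytical solution is available. Using the notation $\dot\theta = d\theta/dt$, the
trick is to treat $\dot\theta$ as a function of $\theta$ instead of $t$. Note that

$$\frac{d^2\theta}{dt^2} = \frac{d}{dt}\left(\frac{d\theta}{dt}\right) = \frac{d\dot\theta}{dt} = \frac{d\theta}{dt}\frac{d\dot\theta}{d\theta} = \dot\theta\frac{d\dot\theta}{d\theta} = \frac{d}{d\theta}\left(\frac{1}{2}\dot\theta^2\right).$$

This allows us to write (9.2) as

$$\frac{d}{d\theta}\left(\frac{1}{2}\dot\theta^2\right) = -\omega^2\sin\theta,$$

which we can integrate once to give

$$\frac{1}{2}\dot\theta^2 = \omega^2\cos\theta + \text{constant}.$$

Using $\omega^2 = g/l$, we can write this as

$$\frac{1}{2}ml^2\dot\theta^2 - mgl\cos\theta = E. \qquad (9.3)$$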


This is just a statement of conservation of energy, E, with the first term representing 
kinetic energy, and the second, gravitational potential energy. Systems like (9.2), 
which can be integrated once to determine a conserved quantity, here energy, 
E, are called conservative systems. Note that if we try to account for a small 
amount of friction at the point of suspension of the pendulum, we need to add a 
term proportional to $d\theta/dt$ to the left hand side of (9.2). The system is then no
longer conservative, with dramatic consequences for the motion of the pendulum 
(see Exercise 9.6). 
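Conservation of $E$ in (9.3) also gives a convenient check on any numerical integration of (9.2). The sketch below is an illustration only: the values of $g$, $l$ and $m$, the initial angle, and the fixed-step fourth order Runge-Kutta scheme are all assumptions of ours. It tracks the largest deviation of the computed energy from its initial value.

```python
import math

g, l, m = 9.81, 1.0, 1.0      # illustrative values (assumptions)
omega2 = g / l                # omega^2 = g / l

def energy(theta, theta_dot):
    # E from (9.3): kinetic plus gravitational potential energy.
    return 0.5 * m * l * l * theta_dot ** 2 - m * g * l * math.cos(theta)

def acc(theta):
    # Angular acceleration from (9.2).
    return -omega2 * math.sin(theta)

def step(theta, theta_dot, h):
    """One RK4 step for the system theta' = theta_dot, theta_dot' = acc(theta)."""
    k1t, k1v = theta_dot, acc(theta)
    k2t, k2v = theta_dot + 0.5 * h * k1v, acc(theta + 0.5 * h * k1t)
    k3t, k3v = theta_dot + 0.5 * h * k2v, acc(theta + 0.5 * h * k2t)
    k4t, k4v = theta_dot + h * k3v, acc(theta + h * k3t)
    theta += h * (k1t + 2.0 * k2t + 2.0 * k3t + k4t) / 6.0
    theta_dot += h * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
    return theta, theta_dot

theta, theta_dot = 2.0, 0.0   # released from rest at a large angle
E0 = energy(theta, theta_dot)
drift = 0.0
for _ in range(10000):        # 10000 steps of length 1e-3
    theta, theta_dot = step(theta, theta_dot, 1e-3)
    drift = max(drift, abs(energy(theta, theta_dot) - E0))

print(E0, drift)
```

A general-purpose scheme like this one conserves $E$ only approximately, so the size of `drift` is a useful measure of the accuracy of the integration.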

From (9.3) we can see that

$$\frac{d\theta}{dt} = \pm\sqrt{\frac{2E}{ml^2} + \frac{2g}{l}\cos\theta}.$$

We can integrate this to arrive at the implicit solution

$$t = \pm\int_{\theta_0}^{\theta} \frac{d\theta'}{\sqrt{\dfrac{2E}{ml^2} + \dfrac{2g}{l}\cos\theta'}}, \qquad (9.4)$$

where $\theta_0$ is the angle of the pendulum when $t = 0$. Note that the two constants of
integration are $E$ and $\theta_0$, the initial energy and angle of the pendulum. Equation
(9.4) is a simple representation of the solution, which, if necessary, we can write in
terms of Jacobian elliptic functions (see Section 9.4), so now everything is clear . . . ,
except of course that it isn’t! Presumably equation (9.4) gives oscillatory solutions
for small initial angles and kinetic energies, and solutions with $\theta$ increasing with $t$
for large enough initial energies, but this doesn’t really leap off the page at you.
From (9.4) we have a quantitative expression for the solution, but we are really
more interested in the qualitative nature of the solution.

Let’s go back to (9.3) and write

$$\dot\theta^2 = \frac{2E}{ml^2} + \frac{2g}{l}\cos\theta.$$

Graphs of $\dot\theta^2$ as a function of $\theta$ are shown in Figure 9.2(a) for different values of $E$.

- For $E > mgl$ the curves lie completely above the $\theta$-axis.

- For $-mgl < E < mgl$ the curves intersect the $\theta$-axis.

- For $E < -mgl$ the curves lie completely below the $\theta$-axis (remember, $-1 \leq \cos\theta \leq 1$).

We can now determine $\dot\theta$ as a function of $\theta$ by taking the square root of the curves
in Figure 9.2(a) to obtain the curves in Figure 9.2(b), remembering to take both
the positive and negative square root.

- For $E > mgl$, the curves lie either fully above or fully below the $\theta$-axis.

- For $-mgl < E < mgl$, only finite portions of the graph of $\dot\theta^2$ lie above the $\theta$-axis,
so the square root gives finite, closed curves.

- For $E < -mgl$, there is no real solution. This corresponds to the fact that the
pendulum always has a gravitational potential energy of at least $-mgl$, so we
must have $E \geq -mgl$.

As we shall see later, the solution with $E = mgl$ is an important one. The graph
of $\dot\theta^2$ just touches the $\theta$-axis at $\theta = \pm(2n - 1)\pi$, for $n = 1, 2, \ldots$, and taking the
square root gives the curves that pass through these points shown in Figure 9.2(b).

How do $\theta$ and $\dot\theta$ vary along these solution curves as $t$ increases? If $\dot\theta$ is positive, $\theta$
increases with $t$, and vice versa (remember, $\dot\theta = d\theta/dt$ is, by definition, the rate at
which $\theta$ changes with $t$). This allows us to add arrows to Figure 9.2(b), indicating
in which direction the solution changes with time. We have now constructed our
first phase portrait for a nonlinear ordinary differential equation. The $(\theta, \dot\theta)$-plane
is called the phase plane. Each of the solution curves represents a possible
solution of (9.2), and is known as an integral path or trajectory. If we know the
initial conditions, $\theta = \theta_0$, $\dot\theta = \dot\theta_0 = \pm\sqrt{2E/(ml^2) + (2g/l)\cos\theta_0}$, the integral path that passes
through the point $(\theta_0, \dot\theta_0)$ when $t = 0$ represents the solution. Finally, note that,
since $\theta = \pi$ is equivalent to $\theta = -\pi$, we only need to consider the phase portrait for
$-\pi \leq \theta \leq \pi$. Alternatively, we can cut the phase plane along the lines $\theta = -\pi$ and
$\theta = \pi$ and join them up, so that the integral paths lie on the surface of a cylinder.

Fig. 9.2. (a) $\dot\theta^2$ as a function of $\theta$ for different values of $E$. (b) $\dot\theta$ as a function of $\theta$,
the phase portrait for the simple pendulum. The cross at (0, 0) indicates an equilibrium
solution. The arrows indicate the path followed by the solution as $t$ increases. Note that
the phase portrait for $|\theta| \geq \pi$ is the periodic extension of the phase portrait for $|\theta| \leq \pi$.


Having gone to the trouble of constructing a phase portrait for (9.2), does it 
tell us what the pendulum does in a form more digestible than that given by 
(9.4)? Let’s consider the three qualitatively different types of integral path shown 
in Figure 9.2(b). 


(i) Equilibrium solutions The points $\dot\theta = 0$, $\theta = 0$ or $\pm\pi$, represent the two
equilibrium solutions of (9.2). The point $(0, 0)$ is the equilibrium with the
pendulum vertically downward, $(\pi, 0)$ the equilibrium with the pendulum
vertically upward. Points close to $(0, 0)$ lie on small closed trajectories close
to $(0, 0)$. This indicates that $(0, 0)$ is a stable equilibrium point, since
a small change in the state of the system away from equilibrium leads to
solutions that remain close to equilibrium. If you cough on a pendulum
hanging downwards, you will only excite a small oscillation. In contrast,
points close to $(\pi, 0)$ lie on trajectories that take the solution far away from
$(\pi, 0)$, and we say that this is an unstable equilibrium point. If you
cough on a pendulum balanced precariously above its point of support, it
will fall. Of course, in practice it is impossible to balance a pendulum in
this way, precisely because the equilibrium is unstable. We will refine our
definitions of stability and instability in Chapter 13.

(ii) Periodic solutions Integral paths with $-mgl < E < mgl$ are closed, and
represent periodic solutions. They are often referred to as limit cycles or
periodic orbits. The pendulum swings back and forth without reaching the
upward vertical. These orbits are stable, since nearby orbits remain nearby.
Note that the frequency of the oscillation depends upon its amplitude, a
situation that is typical for nonlinear oscillators. For the simple pendulum,
setting $\dot\theta = 0$ in (9.3) shows that the amplitude of the motion is
$\theta_{\max} = \cos^{-1}(-E/mgl)$, and (9.4) shows that
the period of the motion, $T$, is given by

$$T = 4\int_{0}^{\theta_{\max}} \frac{d\theta}{\sqrt{\dfrac{2E}{ml^2} + \dfrac{2g}{l}\cos\theta}}.$$

Small closed trajectories in the neighbourhood of the equilibrium point at
$(0, 0)$ are described by simple harmonic motion, (9.1). Note that, in contrast
to the full nonlinear system, the frequency of simple harmonic motion is
independent of its amplitude. The idea of linearizing a nonlinear system
of ordinary differential equations close to an equilibrium point in order to
determine what the phase portrait looks like there, is one that we will return
to later.

In terms of the phase portrait on the cylindrical surface, trajectories with
$E > mgl$ are also stable periodic solutions, looping round and round the
cylinder. The pendulum has enough kinetic energy to swing round and
round its point of support.

(iii) Heteroclinic solutions The two integral paths that connect $(-\pi, 0)$ and
$(\pi, 0)$ have $E = mgl$. The path with $\dot\theta \geq 0$ has $\theta \to \pm\pi$ as $t \to \pm\infty$ (we
will prove this later). This solution represents a motion where the pendulum
swings around towards the upward vertical, and is just caught between
falling back and swinging over. This is known as a heteroclinic path, since
it connects different equilibrium points.† Heteroclinic paths are important,
because they represent the boundaries between qualitatively different types
of behaviour. They are also unstable, since nearby orbits behave qualitatively
differently. Here, the heteroclinic orbits separate motions where the
pendulum swings back and forth from motions where it swings round and
round (different types of periodic solution). In terms of the phase portrait on
a cylindrical surface, we can consider these paths to be homoclinic paths,
since they connect an equilibrium point to itself.‡
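The dependence of the period on the amplitude, noted in (ii) above, is easy to quantify. The period integral obtained from (9.4) has an integrable singularity at $\theta = \theta_{\max}$, so the sketch below first applies the standard substitution $\sin(\theta/2) = k\sin\phi$ with $k = \sin(\theta_{\max}/2)$, which turns it into $T = (4/\omega)\int_0^{\pi/2} d\phi/\sqrt{1 - k^2\sin^2\phi}$. This substitution, the midpoint rule, and the values of $g$ and $l$ are our choices for illustration.

```python
import math

def period(theta_max, g=9.81, l=1.0, n=2000):
    """Period of the pendulum with amplitude theta_max (0 < theta_max < pi)."""
    omega = math.sqrt(g / l)
    k = math.sin(0.5 * theta_max)
    dphi = 0.5 * math.pi / n
    total = 0.0
    for i in range(n):
        phi = (i + 0.5) * dphi                      # midpoint of the i-th subinterval
        total += dphi / math.sqrt(1.0 - (k * math.sin(phi)) ** 2)
    return 4.0 * total / omega

T0 = 2.0 * math.pi / math.sqrt(9.81 / 1.0)          # period of simple harmonic motion
amplitudes = [0.01, 1.0, 0.5 * math.pi, 3.0]
ratios = [period(a) / T0 for a in amplitudes]

for a, r in zip(amplitudes, ratios):
    print(a, r)
```

For very small amplitudes the ratio is close to one, recovering simple harmonic motion, and it grows without bound as $\theta_{\max} \to \pi$, consistent with the heteroclinic path in (iii) taking an infinite time to reach the upward vertical.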

We have now seen that, if we can determine the qualitative nature of the phase 
portrait of a second order nonlinear ordinary differential equation and sketch it, we 
can extract a lot of information about the qualitative behaviour of its solutions. 
This information provides rather more insight than the analytical solution, (9.4),
into the behaviour of the physical system that the equation models. For nonconservative
equations, we cannot integrate the equation directly to get at the equation
of the integral paths, and we have to be rather more cunning in order to sketch the
phase portrait. We will develop methods to tackle second order, nonconservative,
ordinary differential equations in Section 9.3. Before that, we will consider the
simpler case of first order nonlinear ordinary differential equations.

† Greek hetero = different
‡ Greek homo = same


9.2 First Order Autonomous Nonlinear Ordinary Differential 
Equations 

An autonomous ordinary differential equation is one in which the independent
variable does not appear explicitly, for example (9.2). The equations $\ddot\theta + t^2\theta = 0$ and
$\dot x = t - x^2$ are nonautonomous. Note, however, that an $n$th order nonautonomous
ordinary differential equation can always be written as an $(n+1)$th order system of
autonomous ordinary differential equations. For example, $\dot x = t - x^2$ is equivalent
to $\dot x = y - x^2$, $\dot y = 1$ with $y = 0$ when $t = 0$.

In this section we focus on the qualitative behaviour of solutions of first order,
autonomous, ordinary differential equations, which can always be written as

$$\frac{dx}{dt} = \dot x = X(x), \qquad (9.5)$$

with $X(x)$ a given function of $x$. Of course, such an equation is separable, with the
solution subject to $x(0) = x_0$ given by

$$t = \int_{x_0}^{x} \frac{dx'}{X(x')}.$$

As we found in the previous section, solving the equation analytically is not necessarily
the easiest way to determine the qualitative behaviour of the system (see
Exercise 9.2).


9.2.1 The Phase Line 

Consider the graph of the function $X(x)$. An example is shown in Figure 9.3(a).
If $X(x_1) = 0$, then $\dot x = 0$, and hence $x = x_1$ is an equilibrium solution of (9.5).
For the example shown in Figure 9.3 there are three equilibrium points, at $x = x_1$,
$x_2$ and $x_3$. We can also see that $\dot x = X(x) < 0$, and hence $x$ is a decreasing
function of $t$, for $x < x_1$ and $x_2 < x < x_3$. Similarly, $\dot x = X(x) > 0$, and hence
$x$ increases as $t$ increases, for $x_1 < x < x_2$ and $x > x_3$. By analogy with the
phase plane, where we constructed the phase portrait for a second order system
in the previous section, we can draw a phase line for this first order equation, as
shown in Figure 9.3(b). The arrows indicate whether $x$ increases or decreases with
$t$. Clearly, the different types of behaviour that are possible are rather limited by
the constraint that the trajectories lie in a single dimension. Both for this example
and in general, trajectories either enter an equilibrium point or head off to infinity.
In particular, periodic solutions are not possible.† Solutions that begin close to
$x = x_1$ or $x = x_3$ move away from the equilibrium point, so that $x = x_1$ and $x = x_3$
are unstable. In contrast, $x = x_2$ is a stable equilibrium point, since solutions that
start close to it approach it.

Fig. 9.3. (a) An example of a graph of $\dot x = X(x)$. (b) The equivalent phase line.


From Figure 9.3(b) we can see that

$x \to -\infty$ as $t \to \infty$ when $x_0 < x_1$,

$x \to x_2$ as $t \to \infty$ when $x_1 < x_0 < x_3$,

$x \to \infty$ as $t \to \infty$ when $x_0 > x_3$.


The set $D_{-\infty} = \{x \mid x < x_1\}$ is called the domain of attraction or basin of
attraction of minus infinity. Similarly, $D_2 = \{x \mid x_1 < x < x_3\}$ is the domain of
attraction of $x_2$ and $D_\infty = \{x \mid x > x_3\}$ that of plus infinity. All of this information
has come from a qualitative analysis of the graph of $\dot x = X(x)$.

† If there is some geometrical periodicity in the problem so that, for example, by analogy with
the simple pendulum, $-\pi \leq x \leq \pi$ and $x = -\pi$ is equivalent to $x = \pi$, we can construct a
phase loop, around which the solution can orbit indefinitely.


9.2.2 Local Analysis at an Equilibrium Point

If $x = x_1$ is an equilibrium solution of (9.5), what happens quantitatively close to
$x_1$? Firstly, let’s move the equilibrium point to the origin by defining $\bar x = x - x_1$,
so that

$$\frac{d\bar x}{dt} = X(x_1 + \bar x).$$
We can expand $X(x_1 + \bar x)$ as a Taylor series,

$$X(x_1 + \bar x) = X(x_1) + \bar x X'(x_1) + \frac{1}{2}\bar x^2 X''(x_1) + \cdots, \qquad (9.6)$$

where $X' = dX/dx$. In the neighbourhood of the equilibrium point, $|\bar x| \ll 1$, and
hence

$$\frac{d\bar x}{dt} \approx X(x_1) + \bar x X'(x_1).$$

Since $x = x_1$ is an equilibrium solution, $X(x_1) = 0$, and, assuming that $X'(x_1) \neq 0$,

$$\frac{d\bar x}{dt} \approx X'(x_1)\bar x \quad \text{for } |\bar x| \ll 1.$$

This has solution $\bar x = k\exp\{X'(x_1)t\}$ for some constant $k$. If $X'(x_1) < 0$, $\bar x \to 0$
($x \to x_1$) as $t \to \infty$, and the equilibrium point is therefore stable. If $X'(x_1) > 0$,
$\bar x \to 0$ ($x \to x_1$) as $t \to -\infty$, and the equilibrium point is therefore unstable. This
is consistent with our qualitative analysis (note the slope of $X(x)$ at the equilibrium
points in Figure 9.3(a)). Moreover, we now know that solutions approach stable
equilibrium points exponentially fast as $t \to \infty$, a piece of quantitative information
well worth knowing.

If $X'(x_1) \neq 0$, we say that $x = x_1$ is a hyperbolic equilibrium point, and
this analysis determines how solutions behave in its neighbourhood. In particular,
solutions that do not start at a hyperbolic equilibrium point cannot reach it in a
finite time. If $X'(x_1) = 0$, $x = x_1$ is a nonhyperbolic equilibrium point, and
we need to retain more terms in the Taylor expansion of $X$, (9.6), in order to sort
out what happens close to the equilibrium point. We will consider this further in
Chapter 13.
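The local analysis of this section can be tried out on a concrete example. The cubic $X(x) = (x-1)(x-2)(x-3)$ used below is a hypothetical choice of ours, picked only because its sign pattern matches the one described for Figure 9.3; the finite difference step and the forward Euler time stepping are also assumptions made for illustration.

```python
def X(x):
    # Hypothetical right hand side with equilibria at x = 1, 2 and 3.
    return (x - 1.0) * (x - 2.0) * (x - 3.0)

def dX(x, eps=1e-6):
    # Central difference approximation to X'(x).
    return (X(x + eps) - X(x - eps)) / (2.0 * eps)

def classify(x_eq, tol=1e-8):
    """Classify an equilibrium point of dx/dt = X(x) by the sign of X'(x_eq)."""
    slope = dX(x_eq)
    if abs(slope) < tol:
        return "nonhyperbolic"
    return "stable" if slope < 0.0 else "unstable"

labels = {x_eq: classify(x_eq) for x_eq in (1.0, 2.0, 3.0)}
print(labels)

# A solution starting between x = 1 and x = 3 should approach x = 2.
x, dt = 1.5, 0.01
for _ in range(2000):          # forward Euler up to t = 20
    x += dt * X(x)
print(x)
```

Since $X'(2) = -1$ for this example, the distance from $x = 2$ should eventually decay like $e^{-t}$, which is the exponential approach predicted by the linearized equation.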

9.3 Second Order Autonomous Nonlinear Ordinary Differential 
Equations 

Any second order, autonomous, ordinary differential equation can be written as a 
system of two first order equations, in the form 

$$\dot x = X(x,y), \quad \dot y = Y(x,y). \qquad (9.7)$$

For example, consider (9.2), which governs the motion of a simple pendulum. Let’s
define $x = \theta$ and $y = \dot x$. Now, $\dot y = \ddot x = \ddot\theta = -(g/l)\sin\theta$, and hence

$$\dot x = y, \quad \dot y = -\frac{g}{l}\sin x.$$

From now on, we will assume that the right hand sides in (9.7) are continuously
differentiable†, and hence that a solution exists and is unique in the sense of
Theorem 8.2, for all of the systems that we study.

† Continuously differentiable functions are continuous and have derivatives with respect to the
independent variables that are continuous.

9.3.1 The Phase Plane

In Section 9.1, we saw how the solutions of the equations of motion of a simple 
pendulum can be represented in the phase plane. Let’s now consider the phase 
plane for a general second order system. 

An integral path is a solution $(x(t), y(t))$ of (9.7), plotted in the $(x, y)$-plane, or
phase plane. The slope of an integral path is

$$\frac{dy}{dx} = \frac{dy/dt}{dx/dt} = \frac{\dot y}{\dot x} = \frac{Y(x,y)}{X(x,y)}.$$

This slope is uniquely defined at all points $x = x_0$ and $y = y_0$ at which $X(x_0,y_0)$
and $Y(x_0,y_0)$ are not both zero. These are known as ordinary points. Since
the solution through such a point exists and is unique, we deduce that integral
paths cannot cross at ordinary points. This is possibly the most useful piece of
information that you need to bear in mind when sketching phase portraits. Integral
paths cannot cross at ordinary points!

Points $(x_0, y_0)$ that are not ordinary have $X(x_0,y_0) = Y(x_0,y_0) = 0$, and are
therefore equilibrium points. In other words, $\dot x = \dot y = 0$ when $x = x_0$, $y = y_0$, and
this point represents an equilibrium solution of (9.7). At an equilibrium point, the
slope, $dy/dx$, of the integral paths is not well-defined, and we deduce that integral
paths can meet only at equilibrium points, and only as $t \to \pm\infty$. This will become
clearer in the next section.


9.3.2 Equilibrium Points 

In order to determine what the phase portrait looks like close to an equilibrium
point, we proceed as we did for first order systems. We begin by shifting the origin
to an equilibrium point at $(x_0, y_0)$, using

$$x = x_0 + \bar x, \quad y = y_0 + \bar y.$$

Now we Taylor expand the functions $X$ and $Y$, remembering that $X(x_0,y_0) =
Y(x_0,y_0) = 0$, to obtain

$$X(x_0 + \bar x, y_0 + \bar y) = \bar x\frac{\partial X}{\partial x}(x_0,y_0) + \bar y\frac{\partial X}{\partial y}(x_0,y_0) + \cdots,$$

$$Y(x_0 + \bar x, y_0 + \bar y) = \bar x\frac{\partial Y}{\partial x}(x_0,y_0) + \bar y\frac{\partial Y}{\partial y}(x_0,y_0) + \cdots.$$

In the neighbourhood of the equilibrium point, $|\bar x| \ll 1$ and $|\bar y| \ll 1$, and hence

$$\frac{d\bar x}{dt} \approx \bar x X_x(x_0,y_0) + \bar y X_y(x_0,y_0),$$

$$\frac{d\bar y}{dt} \approx \bar x Y_x(x_0,y_0) + \bar y Y_y(x_0,y_0), \qquad (9.8)$$

where we have used the notation $X_x = \partial X/\partial x$. The most convenient way of writing
this is in matrix form, as

$$\frac{d\mathbf{u}}{dt} = J(x_0,y_0)\mathbf{u}, \qquad (9.9)$$

where

$$\mathbf{u} = \begin{pmatrix} \bar x \\ \bar y \end{pmatrix}, \quad J(x_0,y_0) = \begin{pmatrix} X_x(x_0,y_0) & X_y(x_0,y_0) \\ Y_x(x_0,y_0) & Y_y(x_0,y_0) \end{pmatrix}.$$

We call $J(x_0, y_0)$ the Jacobian matrix at the equilibrium point $(x_0, y_0)$. Equation
(9.9) represents a pair of linear, first order ordinary differential equations with
constant coefficients, which suggests that we can look for solutions of the form
$\mathbf{u} = \mathbf{u}_0 e^{\lambda t}$, where $\mathbf{u}_0$ and $\lambda$ are constants to be determined. Substituting this form
of solution into (9.9) gives

$$\lambda\mathbf{u}_0 = J\mathbf{u}_0. \qquad (9.10)$$

This is an eigenvalue problem. There are two eigenvalues, $\lambda = \lambda_1$ and $\lambda = \lambda_2$,
possibly equal, possibly complex, and corresponding unit eigenvectors $\mathbf{u}_0 = \mathbf{u}_1$ and
$\mathbf{u}_0 = \mathbf{u}_2$. These provide us with two possible solutions of (9.9), $\mathbf{u} = \mathbf{u}_1 e^{\lambda_1 t}$ and
$\mathbf{u} = \mathbf{u}_2 e^{\lambda_2 t}$, so that the general solution is the linear combination

$$\mathbf{u} = A_1\mathbf{u}_1 e^{\lambda_1 t} + A_2\mathbf{u}_2 e^{\lambda_2 t}, \qquad (9.11)$$

for any constants $A_1$ and $A_2$.

When the eigenvalues $\lambda_1$ and $\lambda_2$ are real, distinct and nonzero, so are the
eigenvectors $\mathbf{u}_1$ and $\mathbf{u}_2$, and (9.11) suggests that the form of the solution is simpler if
we make a linear transformation so that $\mathbf{u}_1$ and $\mathbf{u}_2$ lie along the coordinate axes.
Such a transformation is given by $\mathbf{u} = P\mathbf{v}$, where $\mathbf{v} = (\hat x, \hat y)^T$ and $P = (\mathbf{u}_1, \mathbf{u}_2)$ is a
matrix with columns given by $\mathbf{u}_1$ and $\mathbf{u}_2$. Substituting this into (9.9) gives

$$P\dot{\mathbf{v}} = JP\mathbf{v}, \qquad \dot{\mathbf{v}} = P^{-1}JP\mathbf{v} = \Lambda\mathbf{v},$$

where

$$\Lambda = \begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix}.$$

Therefore $\dot{\hat x} = \lambda_1\hat x$, $\dot{\hat y} = \lambda_2\hat y$, so that

$$\hat x = k_1 e^{\lambda_1 t}, \quad \hat y = k_2 e^{\lambda_2 t}, \qquad (9.12)$$

with $k_1$ and $k_2$ constants, and hence

$$\hat y = k_3\hat x^{\lambda_2/\lambda_1}, \qquad (9.13)$$

where $k_3 = k_2/k_1^{\lambda_2/\lambda_1}$. This is the equation of the integral paths in the transformed
coordinates. Note that the transformed coordinate axes are integral paths. There
are now three main cases to consider.


(i) Distinct, real, positive eigenvalues ($\lambda_1, \lambda_2 > 0$) The solution (9.12)
shows that $\hat x, \hat y \to 0$ as $t \to -\infty$, exponentially fast, so the equilibrium point
is unstable. The equation of the integral paths, (9.13), then shows that they
all meet at the equilibrium point at the origin. The local phase portrait is
sketched in Figure 9.4, in both the transformed and untransformed coordinate
systems. This type of equilibrium point is called an unstable node.
Remember, this is the phase portrait close to the equilibrium point. As the
integral paths head away from the equilibrium point, the linearization that
we have performed in order to obtain (9.9) becomes inappropriate and we
must consider how this local phase portrait fits into the full, global picture.
Figure 9.4(a) illustrates the situation when $\lambda_2 > \lambda_1$. This means that $\hat y$
grows more quickly than $\hat x$ as $t$ increases, and thereby causes the integral
paths to bend as they do. Figure 9.4(b) shows that when $\lambda_2 > \lambda_1$, the
solution grows more rapidly in the $\mathbf{u}_2$-direction than the $\mathbf{u}_1$-direction.

Fig. 9.4. An unstable node with $\lambda_2 > \lambda_1$ sketched in (a) the transformed and (b) the
untransformed coordinate systems.


(ii) Distinct, real, negative eigenvalues ($\lambda_1, \lambda_2 < 0$) In this case, all we
have to do is consider the situation with the sense of $t$ reversed, and we
recover the previous case. The situation is as shown in Figure 9.4, but with
the arrows reversed. This type of equilibrium point is called a stable node.

(iii) Real eigenvalues of opposite sign ($\lambda_1\lambda_2 < 0$) In this case, the coordinate
axes are the only integral paths in the $(\hat x, \hat y)$-plane that enter the equilibrium
point. On the other integral paths, given by (9.13), $\hat x \to \pm\infty$ as $\hat y \to 0$, and
vice versa, as shown in Figure 9.5(a). When $\lambda_2 > 0 > \lambda_1$, $\hat x \to 0$ and
$\hat y \to \pm\infty$ as $t \to \infty$. The integral paths in the directions of the eigenvectors
$\mathbf{u}_1$ and $\mathbf{u}_2$, shown in Figure 9.5(b), are therefore the only ones that enter the
equilibrium point, and are called the stable and unstable separatrices†.
This type of equilibrium point is called a saddle point. The separatrices
of saddle points usually represent the boundaries between different types
of behaviour in a phase portrait. See if you can spot the saddle points in
Figure 9.2(b). Remember, although the separatrices are straight trajectories
in the neighbourhood of the saddle point, they will start to bend as they
head away from it and become governed by the full nonlinear equations.

† singular form, separatrix.


Fig. 9.5. A saddle point with $\lambda_2 > 0 > \lambda_1$ sketched in (a) the transformed and (b) the
untransformed coordinate systems.


The degenerate case where $\lambda_1 = \lambda_2$ is slightly different from that of a stable
or unstable node, and we will not consider it here, but refer you to Exercise 9.4.
However, it should be clear that when $\lambda_1 = \lambda_2 > 0$ the equilibrium point is unstable,
whilst for $\lambda_1 = \lambda_2 < 0$ it is stable.

Let’s now consider what the phase portrait looks like when the eigenvalues $\lambda_1$
and $\lambda_2$ are complex. Since the eigenvalues are the solutions of a quadratic equation
with real coefficients, they must be complex conjugate, so we can write $\lambda = \alpha \pm i\beta$,
with $\alpha$ and $\beta$ real. The general solution, (9.11), then becomes

$$\mathbf{u} = e^{\alpha t}\left(A_1 \mathbf{u}_1 e^{i\beta t} + A_2 \mathbf{u}_2 e^{-i\beta t}\right). \tag{9.14}$$

There are two cases to consider. 

(i) Eigenvalues with strictly positive real part ($\alpha > 0$) The term $e^{\alpha t}$
in (9.14) means that the solution grows exponentially with t, so that the
equilibrium point is unstable. The remaining term in (9.14) is oscillatory,
and we conclude that the integral paths spiral away from the equilibrium
point, as shown in Figure 9.6. This type of equilibrium point is called an
unstable spiral or unstable focus. To determine whether the sense of
rotation is clockwise or anticlockwise, it is easiest just to consider the sign
of dy/dt on the positive x-axis. When y = 0, (9.8) shows that
$dy/dt = Y_x(x_0, y_0)\,x$, and hence the spiral is anticlockwise if
$Y_x(x_0, y_0) > 0$, clockwise if $Y_x(x_0, y_0) < 0$.†

† If $Y_x(x_0, y_0) = 0$, try looking on the positive y-axis.

Fig. 9.6. An unstable, anticlockwise spiral.


(ii) Eigenvalues with strictly negative real part ($\alpha < 0$) This is just the
same as the previous case, but with the sense of t reversed. This is a stable
spiral or stable focus, and has the phase portrait shown in Figure 9.6,
but with the arrows reversed.

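Note that, for complex conjugate eigenvalues, we can transform the system into a
more convenient form by defining the transformation matrix, P, slightly differently.
If we choose either of the eigenvectors, for example $\mathbf{u}_1$, and write $\lambda_1 = \mu + i\omega$, we
can define

$$P = \left(\mathrm{Im}(\mathbf{u}_1), \mathrm{Re}(\mathbf{u}_1)\right),$$

and $\mathbf{u} = P\mathbf{v}$. As before, $\dot{\mathbf{v}} = P^{-1}JP\mathbf{v}$, but now

$$JP = J\left(\mathrm{Im}(\mathbf{u}_1), \mathrm{Re}(\mathbf{u}_1)\right) = \left(\mathrm{Im}(J\mathbf{u}_1), \mathrm{Re}(J\mathbf{u}_1)\right) = \left(\mathrm{Im}(\lambda_1\mathbf{u}_1), \mathrm{Re}(\lambda_1\mathbf{u}_1)\right)$$

$$= \left(\mu\,\mathrm{Im}(\mathbf{u}_1) + \omega\,\mathrm{Re}(\mathbf{u}_1),\ \mu\,\mathrm{Re}(\mathbf{u}_1) - \omega\,\mathrm{Im}(\mathbf{u}_1)\right) = P\begin{pmatrix} \mu & -\omega \\ \omega & \mu \end{pmatrix},$$

and hence

$$P^{-1}JP = \begin{pmatrix} \mu & -\omega \\ \omega & \mu \end{pmatrix}.$$

If $\mathbf{v} = (v_1, v_2)^T$, then

$$\dot{v}_1 = \mu v_1 - \omega v_2, \quad \dot{v}_2 = \omega v_1 + \mu v_2,$$

and, eliminating $v_2$,

$$\ddot{v}_1 - 2\mu\dot{v}_1 + (\omega^2 + \mu^2)v_1 = 0.$$

This is the usual equation for damped simple harmonic motion, with natural
frequency $\omega$ and damping given by $-\mu$.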
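
The elimination step can be written out explicitly. The following is a sketch of the intermediate algebra, assuming $\omega \neq 0$ (which holds whenever the eigenvalues are genuinely complex): the first equation is solved for $v_2$, and the result is substituted into the derivative of the first equation.

```latex
\begin{align*}
v_2 &= \frac{\mu v_1 - \dot{v}_1}{\omega}, \\
\ddot{v}_1 &= \mu\dot{v}_1 - \omega\dot{v}_2
            = \mu\dot{v}_1 - \omega\left(\omega v_1 + \mu v_2\right) \\
           &= \mu\dot{v}_1 - \omega^2 v_1 - \mu\left(\mu v_1 - \dot{v}_1\right)
            = 2\mu\dot{v}_1 - \left(\omega^2 + \mu^2\right)v_1 .
\end{align*}
```

Moving every term to the left hand side gives the second order equation for $v_1$ quoted above.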

Note that all of the above analysis is rather informal, since it is based on a simple 
linearization about the equilibrium point. Can we guarantee that the behaviour 
that we have deduced persists when we study the full, nonlinear system, (9.7)? 
Yes, we can, provided that the equilibrium point is hyperbolic. We can formalize 
this in the following theorem, which holds for autonomous nonlinear systems of 
arbitrary order. 

Theorem 9.1 (Hartman-Grobman) If $\bar{\mathbf{x}}$ is a hyperbolic equilibrium point of the
$n$th order system $\dot{\mathbf{x}} = \mathbf{f}(\mathbf{x})$, then there is a homeomorphism (a mapping that is
one-to-one, onto and continuous and has a continuous inverse) from $\mathbb{R}^n$ to $\mathbb{R}^n$ defined in
a neighbourhood of $\bar{\mathbf{x}}$ that maps trajectories of the nonlinear system to trajectories
of the local linearized system.

We will not give the proof, but refer the interested reader to the original papers by
Hartman (1960) and Grobman (1959).

Each of the five cases we have discussed above is indeed an example of a hyper- 
bolic equilibrium point. If at least one eigenvalue has zero real part, the equilibrium 
point is said to be nonhyperbolic. As we shall see in Chapter 13, the behaviour in 
the neighbourhood of a nonhyperbolic equilibrium point is determined by higher 
order, nonlinear, terms in the Taylor expansions of X and Y . 

An important example where it may not be necessary to consider higher order
terms in the Taylor expansions is when the eigenvalues are purely imaginary, with
$\lambda = \pm i\beta$. In this case, (9.14) with $\alpha = 0$ shows that the solution is purely oscillatory.
The integral paths are closed and consist of a set of nested limit cycles around the 
equilibrium point. This type of equilibrium point is called a centre. However, the 
effect of higher order, nonlinear, terms in the Taylor expansions may be to make 
the local solutions spiral into or out of the equilibrium point. We say that the 
equilibrium point is a linear centre, but a nonlinear spiral. On the other hand, 
there are many physical systems where a linear centre persists, even in the presence 
of the higher order, nonlinear terms, and is called a nonlinear centre. 
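
The classification above can be summarised in terms of the trace and determinant of the 2 × 2 Jacobian, since its eigenvalues are $\left(\mathrm{tr} \pm \sqrt{\mathrm{tr}^2 - 4\det}\right)/2$. The Python sketch below is illustrative only: the function name `classify_equilibrium` and the tolerance `tol` are assumptions introduced here, and the result describes the linearized system, so a "linear centre" may still be a nonlinear spiral.

```python
def classify_equilibrium(a, b, c, d, tol=1e-12):
    """Classify the equilibrium of u' = J u for J = [[a, b], [c, d]].

    Hypothetical helper: the names and the tolerance are illustrative.
    """
    trace = a + d
    det = a * d - b * c
    disc = trace * trace - 4.0 * det  # discriminant of the characteristic quadratic

    if abs(det) <= tol:
        return "nonhyperbolic (zero eigenvalue)"
    if det < 0:
        # Real eigenvalues of opposite sign.
        return "saddle point"
    if disc > tol:
        # Distinct real eigenvalues, both with the sign of the trace.
        return "unstable node" if trace > 0 else "stable node"
    if abs(disc) <= tol:
        # Repeated real eigenvalue (the degenerate case).
        return "degenerate unstable node" if trace > 0 else "degenerate stable node"
    # Complex conjugate eigenvalues with real part trace / 2.
    if abs(trace) <= tol:
        return "linear centre"
    return "unstable spiral" if trace > 0 else "stable spiral"


if __name__ == "__main__":
    print(classify_equilibrium(0.0, 1.0, -1.0, 0.0))   # purely imaginary eigenvalues
    print(classify_equilibrium(1.0, -1.0, 1.0, 1.0))   # eigenvalues 1 +/- i
```

A centre reported by this function is exactly the nonhyperbolic case in which the Hartman-Grobman theorem gives no information, so the higher order terms or a symmetry argument are still needed.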

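As an example, consider the simple pendulum, with

$$\dot{x} = y, \quad \dot{y} = -\frac{g}{l}\sin x.$$

This has

$$J = \begin{pmatrix} 0 & 1 \\ -\frac{g}{l}\cos x & 0 \end{pmatrix} \quad \text{and} \quad J(0, 0) = \begin{pmatrix} 0 & 1 \\ -\frac{g}{l} & 0 \end{pmatrix}.$$

The eigenvalues of J(0, 0) are $\lambda = \pm i\sqrt{g/l}$. As we can see in Figure 9.2, the phase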
portrait close to the origin is indeed a nonlinear centre. What is it about this system 
that allows the centre to persist in the full nonlinear analysis? Well, the obvious 
answer is that there is a conserved quantity, the energy, which parameterizes the set 
of limit cycles. However, another way of looking at this is to ask whether the system 
has any symmetries. Here we know that if we map $x \mapsto -x$ and $t \mapsto -t$, reflecting
the phase portrait in the y-axis and reversing time, the equations are unchanged,
and hence the phase portrait should not change. Under this transformation, a stable 
spiral would become an unstable spiral, and vice versa, because of the reversal of 
time. Therefore, since the phase portrait should not change, the origin cannot be 
a spiral and must be a nonlinear centre. 
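
The conserved quantity mentioned above can be checked directly. The following sketch takes the pendulum in the form $\dot{x} = y$, $\dot{y} = -(g/l)\sin x$, consistent with the Jacobian given earlier; the symbol $E$ and its scaling (energy divided by $ml^2$) are assumptions made here for illustration.

```latex
\begin{align*}
E(x, y) &= \tfrac{1}{2}y^2 - \frac{g}{l}\cos x, \\
\frac{dE}{dt} &= y\,\dot{y} + \frac{g}{l}\sin x\,\dot{x}
               = y\left(-\frac{g}{l}\sin x\right) + \frac{g}{l}\sin x\,y = 0 .
\end{align*}
```

Each integral path therefore lies on a level curve of $E$, and close to the origin these level curves are closed, which is why the centre survives the nonlinear terms.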


9.3.3 An Example from Mechanics 

Consider a rigid block of mass M attached to a horizontal spring. The block 
lies flat on a horizontal conveyor belt that moves at speed U and tries to carry the 
block away with it, as shown in Figure 9.7. From Newton’s second law of motion 
and Hooke’s law, 


$$M\ddot{x} = F(\dot{x}) - k(x - x_e),$$


where x is the length of the spring, $x_e$ is the equilibrium length of the spring, k is
the spring constant, $F(\dot{x})$ is the frictional force exerted on the block by the conveyor
belt, and a dot denotes d/dt. We model the frictional force as

$$F(\dot{x}) = \begin{cases} F_0 & \text{for } \dot{x} < U, \\ -F_0 & \text{for } \dot{x} > U, \end{cases}$$

with $F_0$ a constant force. When $\dot{x} = U$, the block moves at the same speed as
the conveyor belt, and this occurs when $k|x - x_e| \le F_0$. In other words, the force
exerted by the spring must exceed the frictional force for the block to move
relative to the belt. This immediately gives us a solution, $\dot{x} = U$ for
$x_e - F_0/k \le x \le x_e + F_0/k$.



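Fig. 9.7. A spring-mounted, rigid block on a conveyor belt.

Our model involves five physical parameters, M, k, $x_e$, $F_0$ and U. If we now
define dimensionless variables $\bar{x} = x/x_e$ and $\bar{t} = t/\sqrt{M/k}$, we obtain

$$\ddot{\bar{x}} = \bar{F} - \bar{x} + 1, \tag{9.15}$$

where

$$\bar{F}(\dot{\bar{x}}) = \begin{cases} \bar{F}_0 & \text{for } \dot{\bar{x}} < \bar{U}, \\ -\bar{F}_0 & \text{for } \dot{\bar{x}} > \bar{U}. \end{cases} \tag{9.16}$$

We also have the possible solution $\dot{\bar{x}} = \bar{U}$ for $1 - \bar{F}_0 \le \bar{x} \le 1 + \bar{F}_0$. There are now
just two dimensionless parameters,

$$\bar{F}_0 = \frac{F_0}{k x_e}, \quad \bar{U} = \frac{U}{x_e}\sqrt{\frac{M}{k}}.$$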
We can now write (9.15) as the system

$$\dot{x} = y, \quad \dot{y} = F(y) - x + 1. \tag{9.17}$$

We have left out the overbars for notational convenience. This system has a single
equilibrium point at $x = 1 + F_0$, y = 0. Since y = 0 < U, the system is linear in
the neighbourhood of this point, with

$$\frac{d}{dt}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} + \begin{pmatrix} 0 \\ F_0 + 1 \end{pmatrix}.$$

The Jacobian matrix has eigenvalues $\pm i$, so the equilibrium point is a linear centre.
In fact, since $\dot{x}(x - 1 - F) + y\dot{y} = 0$, we can integrate to obtain

$$\begin{aligned} \{x - (1 + F_0)\}^2 + y^2 &= \text{constant for } y < U, \\ \{x - (1 - F_0)\}^2 + y^2 &= \text{constant for } y > U. \end{aligned} \tag{9.18}$$


The solutions for $y \neq U$ are therefore concentric circles, and we conclude that the
equilibrium point remains a centre when we take into account the effect of the 
nonlinear terms. The phase portrait is sketched in Figure 9.8. 
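
The first integral (9.18) can be checked numerically for a path that stays below the line y = U. The Python sketch below integrates (9.17) with a classical fourth order Runge-Kutta step and compares the distance from the equilibrium point at the start and end of the path. The parameter values `F0 = 0.5` and `U = 1.0`, the step size and the starting point are all assumptions chosen only for illustration.

```python
import math

F0, U = 0.5, 1.0  # hypothetical dimensionless parameters, for illustration only


def rhs(x, y):
    """Right hand side of (9.17) in the region y < U, where F(y) = F0."""
    return y, F0 - x + 1.0


def rk4_step(x, y, h):
    """One classical fourth order Runge-Kutta step of size h."""
    k1 = rhs(x, y)
    k2 = rhs(x + 0.5 * h * k1[0], y + 0.5 * h * k1[1])
    k3 = rhs(x + 0.5 * h * k2[0], y + 0.5 * h * k2[1])
    k4 = rhs(x + h * k3[0], y + h * k3[1])
    x_new = x + h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
    y_new = y + h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
    return x_new, y_new


def radius(x, y):
    """Distance from the equilibrium point (1 + F0, 0)."""
    return math.hypot(x - (1.0 + F0), y)


x, y = 1.8, 0.0          # start close enough to the centre that y stays below U
r_start = radius(x, y)
max_y = y
for _ in range(10000):   # integrate to t = 10 with h = 0.001
    x, y = rk4_step(x, y, 0.001)
    max_y = max(max_y, y)
r_end = radius(x, y)

print("radius at start:", r_start)
print("radius at end:  ", r_end)
print("largest y seen: ", max_y)
```

Because the largest value of y stays below U, the frictional force never switches sign along this path and the radius is conserved to within the integration error.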

We now need to take some care with integral paths that meet the line y = U.
Since the right hand side of (9.17) is discontinuous at y = U, the slope of the
integral paths is discontinuous there. For $x < 1 - F_0$ and $x > 1 + F_0$, trajectories
simply cross the line y = U. However, we have already seen that the line y = U
for $1 - F_0 < x < 1 + F_0$ is itself a solution. We conclude that an integral path that
meets y = U with x in this range follows this trajectory until $x = 1 + F_0$, when
it moves off on the limit cycle through the point D in Figure 9.8. For example,
consider the trajectory that starts at the point A. This corresponds to an initially
stationary block, with the spring stretched so far that it can immediately overcome
the frictional force. The solution follows the circular trajectory until it reaches
B. At this point, the direction of the frictional force changes, and the solution
follows a different circular trajectory, until it reaches C. At this point, the block is
stationary relative to the conveyor belt, which carries it along with it until the spring
is stretched far enough that the force it exerts exceeds the frictional force. This
occurs at the point D. Thereafter, the solution remains on the periodic solution
through D. On this periodic solution, the speed of the block is always less than
that of the conveyor belt, so the frictional force remains constant, and the block 
undergoes simple harmonic motion. 



Fig. 9.8. The phase portrait for a spring-mounted, rigid block on a conveyor belt. 


9.3.4 Example: Population Dynamics 

Consider two species of animals that live on an island with populations X(t)
and Y(t). If P is a typical size of a population, we can define x(t) = X(t)/P and
y(t) = Y(t)/P as dimensionless measures of the population of each species, and
regard x and y as continuous functions of time, t. A model for the way in which
the two species interact with each other and their environment is

$$\dot{x} = x(A + a_1 x + b_1 y), \quad \dot{y} = y(B + b_2 x + a_2 y), \tag{9.19}$$

where A, B, $a_1$, $b_1$, $a_2$ and $b_2$ are constants. We can interpret each of these equations
as

Rate of change of population = Present population × (Birth rate − Death rate).

Let’s now consider what each of the terms that model the difference between the
birth and death rates represents. If A > 0, the population of species x grows when
$x \ll 1$ and $y \ll 1$, so we can say that x does not rely on eating species y to survive.
In contrast, if A < 0, the population of x dies out, and therefore x must need to
eat species y to survive. The term $a_1 x$ represents the effect of overcrowding and
competition for resources within species x, so we require that $a_1 < 0$. The term
$b_1 y$ represents the interaction between the two species. If species x eats species y,
$b_1 > 0$, so that the more of species y that is available, the faster the population of
x grows. If species y competes with x for the available resources, $b_1 < 0$.

We will consider two species that do not eat each other, but compete for
resources, for example sheep and goats (see Exercise 9.10 for an example of a
predator-prey system). Specifically, we will study the typical system

$$\dot{x} = x(3 - 2x - 2y), \quad \dot{y} = y(2 - 2x - y). \tag{9.20}$$

We will use this system to illustrate the full range of techniques that are available 
to obtain information about the phase portrait of second order systems. The first 
thing to do is to determine where any equilibrium points are, and what their types 
are. This will tell us almost everything we need to know in order to sketch the 
phase portrait, as integral paths can only meet at equilibrium points. Equilibrium 
points are the main structural features of the system, and the integral paths are 
organized around them. 

At equilibrium points, $\dot{x} = \dot{y} = 0$, and hence

(x = 0 or 2x + 2y = 3) and (y = 0 or 2x + y = 2).

The four different possibilities show that there are four equilibrium points, $P_1 =
(0, 0)$, $P_2 = (0, 2)$, $P_3 = (3/2, 0)$ and $P_4 = (1/2, 1)$. The Jacobian matrix is

$$J = \begin{pmatrix} X_x & X_y \\ Y_x & Y_y \end{pmatrix} = \begin{pmatrix} 3 - 4x - 2y & -2x \\ -2y & 2 - 2x - 2y \end{pmatrix},$$

and hence

$$J(0, 0) = \begin{pmatrix} 3 & 0 \\ 0 & 2 \end{pmatrix}, \quad J(0, 2) = \begin{pmatrix} -1 & 0 \\ -4 & -2 \end{pmatrix},$$

$$J(3/2, 0) = \begin{pmatrix} -3 & -3 \\ 0 & -1 \end{pmatrix}, \quad J(1/2, 1) = \begin{pmatrix} -1 & -1 \\ -2 & -1 \end{pmatrix}.$$

Three of these matrices have at least one off-diagonal element equal to zero, so
their eigenvalues can be read directly from the diagonal elements. $P_1$ has
eigenvalues 3 and 2 and is therefore an unstable node. $P_2$ has eigenvalues −1 and −2 and
is therefore a stable node. $P_3$ has eigenvalues −3 and −1 and is therefore also a
stable node. We could work out the direction of the eigenvectors for these three
equilibrium points, but they are not really important for sketching the phase
portrait. The final equilibrium point, $P_4$, has eigenvalues $\lambda = \lambda_\pm = -1 \pm \sqrt{2}$. Since
$\sqrt{2} > 1$, $P_4$ has one positive and one negative eigenvalue, and is therefore a saddle
point. For saddle points it is usually important to determine the directions of the
eigenvectors, since these determine the directions in which the separatrices leave
the equilibrium point. These separatrices are the only integral paths that meet $P_4$, and
we will see that determining their global behaviour is the key to sketching the full
phase portrait. The unit eigenvectors at $P_4$ are $\mathbf{u}_\pm = (\sqrt{1/3}, \mp\sqrt{2/3})^T$.
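
The eigenvalues quoted above can be reproduced with a few lines of Python. This is a sketch using only the standard library; the helper names `jacobian` and `real_eigenvalues` are assumptions introduced here, and the eigenvalue formula is simply the quadratic formula applied to the characteristic polynomial of a 2 × 2 matrix.

```python
import math


def jacobian(x, y):
    """Jacobian of (9.20), with X = x(3 - 2x - 2y) and Y = y(2 - 2x - y)."""
    return ((3 - 4 * x - 2 * y, -2 * x),
            (-2 * y, 2 - 2 * x - 2 * y))


def real_eigenvalues(J):
    """Eigenvalues (smaller, larger) of a 2 x 2 matrix with real eigenvalues."""
    (a, b), (c, d) = J
    trace = a + d
    det = a * d - b * c
    disc = trace * trace - 4 * det
    if disc < 0:
        raise ValueError("eigenvalues are complex")
    root = math.sqrt(disc)
    return ((trace - root) / 2, (trace + root) / 2)


points = {"P1": (0.0, 0.0), "P2": (0.0, 2.0), "P3": (1.5, 0.0), "P4": (0.5, 1.0)}
eigs = {name: real_eigenvalues(jacobian(*p)) for name, p in points.items()}

for name, (low, high) in eigs.items():
    if low * high < 0:
        kind = "saddle point"
    elif high < 0:
        kind = "stable node"
    else:
        kind = "unstable node"
    print(name, points[name], "eigenvalues", low, high, "->", kind)
```

The only equilibrium point with eigenvalues of opposite sign is $P_4$, in agreement with the classification in the text.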

Next, we can consider the nullclines. These are the lines on which either $\dot{x} = 0$
or $\dot{y} = 0$. The vertical nullclines, where $\dot{x} = 0$ (only y is varying so the integral
paths are vertical on the nullcline), are given by

x = 0 or 2x + 2y = 3.

The horizontal nullclines, where $\dot{y} = 0$, are given by

y = 0 or 2x + y = 2.

In general, the nullclines are not integral paths. However, in this case, since $\dot{x} = 0$
when x = 0 and $\dot{y} = 0$ when y = 0, the coordinate axes are themselves integral
paths. This is clear from the biology of the problem, since, if species x is absent
initially, there is no way for it to appear spontaneously, and similarly for species y.

All of the information we have amassed so far is shown in Figure 9.9(a). We 
will confine our attention to the positive quadrant, because we are not interested 
in negative populations. Moreover, the fact that the coordinate axes are integral 
paths prevents any trajectory from leaving the positive quadrant. Note that the 
equilibrium points lie at the points where the nullclines intersect. The directions
of the arrows are determined by considering the signs of $\dot{x}$ and $\dot{y}$ at any point.
For example, on the x-axis, $\dot{x} = x(3 - 2x)$, so that $\dot{x} > 0$, which means that x is
increasing with t, for 0 < x < 3/2, and $\dot{x} < 0$, which means that x is decreasing
with t, for x > 3/2. From all of this local information, and the fact that integral
paths can only cross at equilibrium points, we can sketch a plausible and consistent
phase portrait, as shown in Figure 9.9(b). Apart from the stable separatrices of
the saddle point $P_4$, labelled $S_1$ and $S_2$, all trajectories asymptote to either $P_2$ or
$P_3$ as $t \to \infty$. These separatrices therefore represent the boundary between two
very different types of behaviour. Depending upon the initial conditions, either 
species x or species y dies out completely. Although neither species eats the other, 
by competing for the same resources there can only be one eventual winner in the 
competition to stay alive. Although this is obviously a very simple model, the 
displacement of the native red squirrel by the grey squirrel in Britain, and the 
extinction of Neanderthal man after the arrival of modern man in Europe, are 
examples of situations where two species competed for the same resources. It is 
not necessary that one of the species kills the other directly. Simply a sufficient 
advantage in numbers or enough extra efficiency in exploiting natural resources 
could have been enough for modern man to consign Neanderthal man to oblivion. 

Returning to our simple model, there are some questions that we really ought 
to answer before we can be confident that Figure 9.9(b) is an accurate sketch of 
the phase portrait of (9.20). Firstly, can we be sure that there are no limit cycle 
solutions? An associated question is, if we think that a system does possess a limit 
cycle, can we prove it? Secondly, since the position of the stable separatrices of $P_4$ is
so important, can we prove that we have sketched them correctly? More specifically,
how do we know that $S_1$ originates at $P_1$ and that $S_2$ originates at infinity? Finally,
what does the phase portrait look like far from the origin? We will consider some
mathematical tools for answering these questions in the next four sections. We can,
however, also test whether our phase portrait is correct by solving equations (9.20)
numerically in MATLAB. We must first define the MATLAB function

    function dy = population(t,y)
    % Right hand sides of (9.20), with x = y(1) and y = y(2).
    dy(1) = y(1)*(3-2*y(1)-2*y(2));
    dy(2) = y(2)*(2-2*y(1)-y(2));
    dy = dy';   % ode45 expects a column vector

which returns a column vector with elements given by the right hand sides of
(9.20). Then [t y] = ode45(@population, [0 10], [1 2]) integrates the equations
numerically, in this case for $0 \le t \le 10$, with initial conditions x = y(1) = 1,
y = y(2) = 2. The results can then be displayed with plot(t,y). Figure 9.10
shows the effect of holding the initial value of x fixed, and varying the initial value
of y. For small enough initial values of y, $y \to 0$ as $t \to \infty$. However, when the
initial value of y is sufficiently large, the type of behaviour changes, and $x \to 0$ as
$t \to \infty$, consistent with the phase portrait sketched in Figure 9.9(b).

Fig. 9.9. (a) Local information at the equilibrium points and the nullclines, and (b) the
full phase portrait for the population dynamical system (9.20).
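
The same numerical experiment can be sketched without MATLAB. The Python code below is an illustrative analogue, not a reproduction of `ode45`: it uses a fixed-step classical Runge-Kutta scheme, and the function names, step count and initial values are assumptions chosen for illustration. It holds x(0) = 1 fixed and varies y(0), as described above.

```python
def rhs(x, y):
    """Right hand sides of the competing species system (9.20)."""
    return x * (3 - 2 * x - 2 * y), y * (2 - 2 * x - y)


def integrate(x, y, t_end=10.0, steps=4000):
    """Integrate (9.20) from (x, y) at t = 0 to t = t_end with fixed-step RK4."""
    h = t_end / steps
    for _ in range(steps):
        k1 = rhs(x, y)
        k2 = rhs(x + 0.5 * h * k1[0], y + 0.5 * h * k1[1])
        k3 = rhs(x + 0.5 * h * k2[0], y + 0.5 * h * k2[1])
        k4 = rhs(x + h * k3[0], y + h * k3[1])
        x += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
        y += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
    return x, y


if __name__ == "__main__":
    # Hold x(0) = 1 fixed and vary y(0); these particular values are illustrative.
    for y0 in (0.5, 1.0, 2.0, 3.0):
        x_end, y_end = integrate(1.0, y0)
        print("y(0) =", y0, "-> x(10) =", round(x_end, 4), ", y(10) =", round(y_end, 4))
```

For a small initial value of y the solution should approach $P_3 = (3/2, 0)$, and for a large one it should approach $P_2 = (0, 2)$; the switch occurs where the line x = 1 crosses the stable separatrix of $P_4$.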


Fig. 9.10. The solution of (9.20) for various initial conditions (the panels include
x(0) = 1, y(0) = 0 and x(0) = 1, y(0) = 1). The dashed line is y, the solid line is x.


9.3.5 The Poincaré Index

The Poincaré index of an equilibrium point provides a very simple way of
determining whether a given equilibrium point or collection of equilibrium points can be
surrounded by one or more limit cycles.

Consider the second order system of nonlinear, autonomous ordinary differential
equations given by (9.7). Let $\Gamma$ be any smooth, closed, nonself-intersecting curve in
the (x, y)-phase plane, not necessarily an integral path, that does not pass through
any equilibrium points, as shown in Figure 9.11. At any point P lying on $\Gamma$, there
is a unique integral path through P, with a well-defined direction that we can
characterize using the angle, $\psi(P)$, that it makes with the horizontal ($\tan\psi(P) =
Y/X$). The angle $\psi$ varies continuously as P moves around $\Gamma$. If $\psi$ changes by an
amount $2n_\Gamma\pi$ as P makes one complete, anticlockwise circuit around $\Gamma$, we call $n_\Gamma$
the Poincaré index, which must be an integer.



We can now deduce some properties of the Poincaré index.

(i) If $\Gamma$ is continuously distorted without passing through any equilibrium points,
$n_\Gamma$ does not change.

Proof Since $n_\Gamma$ must vary continuously under the action of a continuous
distortion, but must also be an integer, the Poincaré index cannot change
from its initial value.

(ii) If $\Gamma$ does not enclose any equilibrium points, $n_\Gamma = 0$.

Proof Using the previous result, we can continuously shrink $\Gamma$ down to an
arbitrarily small circle without changing $n_\Gamma$. On this circle, $\psi(P)$ is almost
constant, and therefore $n_\Gamma = 0$.

(iii) If $\Gamma$ is the sum of two closed curves, $\Gamma_1$ and $\Gamma_2$, then $n_\Gamma = n_{\Gamma_1} + n_{\Gamma_2}$.

Proof Consider the curves shown in Figure 9.12. On the curve where $\Gamma_1$
and $\Gamma_2$ intersect, the amount by which $\psi$ varies on traversing $\Gamma_1$ is equal
and opposite to the amount by which $\psi$ varies on traversing $\Gamma_2$, so these
contributions cancel and $n_\Gamma = n_{\Gamma_1} + n_{\Gamma_2}$.

(iv) If $\Gamma$ is a limit cycle then $n_\Gamma = 1$.

Proof This result is obvious.

(v) If integral paths either all enter the region enclosed by $\Gamma$ or all leave this
region, $n_\Gamma = 1$.

Proof This result is also obvious.

Fig. 9.12. The sum of two curves.


(vi) If $\Gamma$ encloses a single node, focus or nonlinear centre, $n_\Gamma = 1$.

Proof For a node or focus, $\Gamma$ can be continuously deformed into a small
circle surrounding the equilibrium point. The linearized solution then shows
that integral paths either all enter or all leave the region enclosed by $\Gamma$, and
the previous result shows that $n_\Gamma = 1$. For a nonlinear centre, $\Gamma$ can be
continuously deformed into one of the limit cycles that encloses the equilibrium
point, and result (iv) shows that $n_\Gamma = 1$.

(vii) If $\Gamma$ encloses a single saddle point, $n_\Gamma = -1$.

Proof $\Gamma$ can be continuously deformed into a small circle surrounding the
saddle point, where the linearized solution gives the phase portrait shown
in Figure 9.5. The direction of the integral paths makes a single clockwise
rotation as P traverses $\Gamma$, and hence $n_\Gamma = -1$.


These results show that we can define the Poincaré index of an equilibrium
point to be the Poincaré index of any curve that encloses the equilibrium point and
no others. In particular, a node, focus or nonlinear centre has Poincaré index n = 1,
whilst a saddle has Poincaré index n = −1. Nonhyperbolic equilibrium points can
have other Poincaré indices, but we will not consider these here. Now result (iii)
shows that $n_\Gamma$ is simply the sum of the Poincaré indices of all the equilibrium points
enclosed by $\Gamma$. Finally, and this is what all of this has been leading up to, result (iv)
shows that the sum of the Poincaré indices of the equilibrium points enclosed by a
limit cycle must be 1. A corollary of this is that a limit cycle must enclose at least
one node, focus or centre.
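
The definition of the Poincaré index translates directly into a numerical procedure: walk once anticlockwise around a closed curve, accumulate the change in the direction angle of the vector field (X, Y), and divide by 2π. The Python sketch below does this on a small circle for the population system (9.20). The function name `poincare_index`, the choice of a circle, and the radius and number of sample points are assumptions made for illustration.

```python
import math


def X(x, y):
    return x * (3 - 2 * x - 2 * y)


def Y(x, y):
    return y * (2 - 2 * x - y)


def poincare_index(X, Y, centre, radius=0.1, n=2000):
    """Net number of anticlockwise turns made by (X, Y) around a circle.

    The circle must not pass through an equilibrium point, and n must be
    large enough that the direction changes by less than pi between samples.
    """
    cx, cy = centre
    total = 0.0
    previous = None
    for k in range(n + 1):
        s = 2.0 * math.pi * k / n
        x = cx + radius * math.cos(s)
        y = cy + radius * math.sin(s)
        psi = math.atan2(Y(x, y), X(x, y))
        if previous is not None:
            delta = psi - previous
            # Unwrap the jump that atan2 makes when the angle crosses +/- pi.
            while delta > math.pi:
                delta -= 2.0 * math.pi
            while delta <= -math.pi:
                delta += 2.0 * math.pi
            total += delta
        previous = psi
    return round(total / (2.0 * math.pi))


if __name__ == "__main__":
    for name, centre in (("P1", (0.0, 0.0)), ("P2", (0.0, 2.0)),
                         ("P3", (1.5, 0.0)), ("P4", (0.5, 1.0)),
                         ("no equilibrium", (1.0, 1.0))):
        print(name, "index =", poincare_index(X, Y, centre))
```

Each circle has radius 0.1, so it encloses at most one of the four equilibrium points of (9.20); the nodes should give index 1, the saddle index −1, and the circle that encloses no equilibrium point index 0.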

Now let’s return to our example population dynamical system, (9.20). If a limit
cycle exists, it cannot cross the coordinate axes, since these are integral paths.
However, there is no node, focus or nonlinear centre in the positive quadrant of the
phase plane, and we conclude that no limit cycle exists.
When a system does possess a node, focus or nonlinear centre, the Poincaré
index can be used to determine which sets of equilibrium points a limit cycle could
enclose if it existed, but it would be useful to have a way of ruling out the existence
of limit cycles altogether, if indeed no limit cycles exist. We now consider a way of
doing this that works for many types of system.


9.3.6 Bendixson’s Negative Criterion and Dulac’s Extension 

Theorem 9.2 (Bendixson’s negative criterion) For the second order system of
nonlinear, autonomous ordinary differential equations given by (9.7), there are no
limit cycles in any simply-connected region of the (x, y)-phase plane where $X_x + Y_y$
does not change sign.


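Proof Let $\Gamma$ be a limit cycle. For the vector field (X(x, y), Y(x, y)), Stokes’ theorem
states that

$$\iint_D \left(\frac{\partial X}{\partial x} + \frac{\partial Y}{\partial y}\right) dx\,dy = \oint_\Gamma \left(X\,dy - Y\,dx\right),$$

where D is the region enclosed by $\Gamma$. However, we can write the right hand side of
this as

$$\oint_\Gamma \left(X\frac{dy}{dt} - Y\frac{dx}{dt}\right) dt = \oint_\Gamma \left(XY - YX\right) dt = 0,$$

and hence

$$\iint_D \left(\frac{\partial X}{\partial x} + \frac{\partial Y}{\partial y}\right) dx\,dy = 0.$$

If $X_x + Y_y$ does not change sign in D, then the integral is either strictly positive or
strictly negative†, which is a contradiction, and hence the result is proved. □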


Note that the restriction of this result to simply-connected regions (regions without 
holes) is crucial. 

If we now apply this theorem to the system (9.20), we find that $X_x + Y_y =
5 - 6x - 4y$. Although this tells us that any limit cycle solution must be intersected
by the line 6x + 4y = 5, we cannot rule out the possibility of the existence of
limit cycles in this way. We need a more powerful version of Bendixson’s negative 
criterion, which is provided by the following theorem. 


Theorem 9.3 (Dulac’s extension to Bendixson’s negative criterion) If
$\rho(x, y)$ is continuously differentiable in some simply-connected region D in the
(x, y)-phase plane, there are no limit cycle solutions if $(\rho X)_x + (\rho Y)_y$ does not
change sign.


† Ignoring the degenerate case where $X_x + Y_y = 0$ in D.

Proof Since

$$\iint_D \left(\frac{\partial(\rho X)}{\partial x} + \frac{\partial(\rho Y)}{\partial y}\right) dx\,dy = \oint_\Gamma \rho\left(X\,dy - Y\,dx\right),$$

the proof is as for Bendixson’s negative criterion. □


Returning again to (9.20), we can choose $\rho = 1/(xy)$ and consider the positive
quadrant where $\rho(x, y)$ is continuously differentiable. We find that

$$\frac{\partial(\rho X)}{\partial x} + \frac{\partial(\rho Y)}{\partial y} = -\frac{2}{y} - \frac{1}{x} < 0,$$

and hence that there can be no limit cycle solutions in the positive quadrant, as
expected.
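
Written out term by term, with X and Y taken from (9.20), the Dulac computation is short. The following is a sketch of the intermediate steps for the choice of weighting function made above.

```latex
\begin{align*}
\rho X &= \frac{x(3 - 2x - 2y)}{xy} = \frac{3 - 2x - 2y}{y},
&\frac{\partial(\rho X)}{\partial x} &= -\frac{2}{y}, \\
\rho Y &= \frac{y(2 - 2x - y)}{xy} = \frac{2 - 2x - y}{x},
&\frac{\partial(\rho Y)}{\partial y} &= -\frac{1}{x}.
\end{align*}
```

Both derivatives are strictly negative when x > 0 and y > 0, so their sum cannot change sign in the positive quadrant.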

Now that we know how to try to rule out the existence of limit cycles, how can 
we go about showing that limit cycles do exist for appropriate systems? We will 
consider this in the next section using the Poincaré-Bendixson theorem. This will
also allow us to prove that the stable separatrices of $P_4$, shown in Figure 9.9, do
indeed behave as sketched.


9.3.7 The Poincaré-Bendixson Theorem

We begin with a definition. Consider the second order system (9.7). If $I^+$ is
a closed subset of the phase plane and any integral path that lies in $I^+$ when
t = 0 remains in $I^+$ for all $t \ge 0$, we say that $I^+$ is a positively invariant set.
Similarly, a negatively invariant set is a closed subset, $I^-$, of the phase plane
such that all integral paths in $I^-$ when t = 0 remain there when t < 0. For example, an
equilibrium point is both a positively and a negatively invariant set, as is a limit
cycle. As a less prosaic example, any subset, S, of the phase plane with outward
unit normal $\mathbf{n}$ that has $(X, Y) \cdot \mathbf{n} \le 0$ on its boundary, so that all integral paths
enter S, is a positively invariant set.

Theorem 9.4 (Poincaré-Bendixson) If there exists a bounded, invariant region,
I, of the phase plane, and I contains no equilibrium points, then I contains at least
one limit cycle.

Note that it is crucial that the region I be bounded, but, in contrast to
Bendixson’s negative criterion, I does not need to be simply-connected. As we shall see,
if I contains a limit cycle, it cannot be simply-connected. The Poincaré-Bendixson
theorem says that the integral paths in I cannot wander around for ever without
asymptoting to a limit cycle. Although this seems obvious, we shall see in
Chapter 15 that the Poincaré-Bendixson theorem does not hold for third and higher
order systems, in which integral paths that represent chaotic solutions can indeed
wander around indefinitely without being attracted to a periodic solution.

The details of the proof are rather technical, and can be omitted on first reading.

Proof of the Poincaré-Bendixson Theorem

We will assume that I is a positively invariant region. The proof when I is negatively
invariant is identical, but with time reversed. We now need two further definitions.

Let $\mathbf{x}(t; \mathbf{x}_0)$ be the solution of (9.7) with $\mathbf{x} = \mathbf{x}_0$ when t = 0. We say that $\mathbf{x}_1$
is an $\omega$-limit point of $\mathbf{x}_0$ if there exists a sequence, $\{t_i\}$, such that $t_i \to \infty$ and
$\mathbf{x}(t_i; \mathbf{x}_0) \to \mathbf{x}_1$ as $i \to \infty$. For example, all the points on an integral path that enters
a stable node or focus at $\mathbf{x} = \mathbf{x}_e$ have $\mathbf{x}_e$ as their only $\omega$-limit point. Similarly, all
points on an integral path that asymptotes to a stable limit cycle as $t \to \infty$ have
all the points that lie on the limit cycle as $\omega$-limit points. We call the set of all
$\omega$-limit points of $\mathbf{x}_0$ the $\omega$-limit set of $\mathbf{x}_0$, which is denoted by $\omega(\mathbf{x}_0)$.

Let L be a finite curve such that all integral paths that meet L cross it in the
same direction. We say that L is a transversal. If $\mathbf{x}_0$ is not an equilibrium point,
it is always possible to construct a transversal through $\mathbf{x}_0$, since the slope of the
integral paths is well-defined in the neighbourhood of $\mathbf{x}_0$.


Lemma 9.1 Let I be a positively invariant region in the phase plane, and $L \subset I$ a
transversal through $\mathbf{x}_0 \in I$. The integral path, $\mathbf{x}(t; \mathbf{x}_0)$, that passes through $\mathbf{x}_0$ when
t = 0 intersects L in a monotone sequence. In other words, if $\mathbf{x}_i$ is the point where
$\mathbf{x}(t; \mathbf{x}_0)$ meets L for the $i$th time, then $\mathbf{x}_i$ lies in the segment of L between $\mathbf{x}_{i-1}$ and
$\mathbf{x}_{i+1}$.


Proof Consider the invariant region D bounded by the integral path from $\mathbf{x}_{i-1}$ to
$\mathbf{x}_i$ and the segment of L between $\mathbf{x}_{i-1}$ and $\mathbf{x}_i$. There are two possibilities. Firstly,
the integral path through $\mathbf{x}_i$ enters D and remains there (see Figure 9.13(a)). In
this case, $\mathbf{x}_{i+1}$, if it exists, lies in D, and $\mathbf{x}_i$ therefore lies in the segment of L
between $\mathbf{x}_{i-1}$ and $\mathbf{x}_{i+1}$. Secondly, the integral path through $\mathbf{x}_{i-1}$ originates in D
(see Figure 9.13(b)). In this case, $\mathbf{x}_{i-2}$, if it exists, lies in D, and $\mathbf{x}_{i-1}$ therefore
lies in the segment of L between $\mathbf{x}_{i-2}$ and $\mathbf{x}_i$. □



Fig. 9.13. The intersection of an integral path in an invariant region with a transversal. 





Note that Lemma 9.1 does not hold in systems of higher dimension, since it relies 
crucially on the fact that a closed curve separates an inside from an outside (the 
Jordan curve theorem). 

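Lemma 9.2 Consider the region I, transverse curve L and point $\mathbf{x}_0 \in I$ defined in
Lemma 9.1. The $\omega$-limit set of $\mathbf{x}_0$, $\omega(\mathbf{x}_0)$, intersects L at most once.

Proof Suppose that $\omega(\mathbf{x}_0)$ intersects L twice, at $\bar{\mathbf{x}}$ and $\hat{\mathbf{x}}$. We can therefore find
sequences of points, $\{\bar{\mathbf{x}}_i\}$ and $\{\hat{\mathbf{x}}_i\}$, lying in L such that $\{\bar{\mathbf{x}}_i\} \to \bar{\mathbf{x}}$ and $\{\hat{\mathbf{x}}_i\} \to \hat{\mathbf{x}}$ as
$i \to \infty$. However, this cannot occur, since Lemma 9.1 states that these intersections
must be monotonic. Hence the result is proved by contradiction. □

Lemma 9.3 If $\mathbf{x}_1 \in \omega(\mathbf{x}_0)$ is not an equilibrium point and lies on $\mathbf{x}(t; \mathbf{x}_0)$, the
integral path through $\mathbf{x}_0$, then $\mathbf{x}(t; \mathbf{x}_1)$, the integral path through $\mathbf{x}_1$, is a closed
curve, also passing through $\mathbf{x}_0$.

Proof Since $\mathbf{x}_1$ lies in $\mathbf{x}(t; \mathbf{x}_0)$, $\omega(\mathbf{x}_1) = \omega(\mathbf{x}_0)$, and hence $\mathbf{x}_1 \in \omega(\mathbf{x}_1)$. Let L be
a curve through $\mathbf{x}_1$ transverse to the integral paths. By Lemma 9.2, $\mathbf{x}(t; \mathbf{x}_1)$ can
only meet L once, and hence is a closed curve. □

We can now finally prove Theorem 9.4, the Poincaré-Bendixson theorem.

Proof Let $\mathbf{x}_0$ be a point in I, and hence $\mathbf{x}(t; \mathbf{x}_0) \subset I$ and $\omega(\mathbf{x}_0) \subset I$. Choose
$\mathbf{x}_1 \in \omega(\mathbf{x}_0)$.

If $\mathbf{x}_1 \in \mathbf{x}(t; \mathbf{x}_0)$, then, by Lemma 9.3, the integral path through $\mathbf{x}_1$ is a closed
curve and, since there are no equilibrium points in I, must be a limit cycle.

If $\mathbf{x}_1 \notin \mathbf{x}(t; \mathbf{x}_0)$, let $\mathbf{x}_2 \in \omega(\mathbf{x}_1)$. Let L be a transversal through $\mathbf{x}_2$. Since
$\mathbf{x}_2 \in \omega(\mathbf{x}_1)$, the integral path through $\mathbf{x}_1$ must intersect L at a monotonic sequence
of points $\mathbf{x}_{1i}$, such that $\mathbf{x}_{1i} \to \mathbf{x}_2$ as $i \to \infty$. But $\mathbf{x}_{1i} \in \omega(\mathbf{x}_0)$, so the integral path
through $\mathbf{x}_0$ must pass arbitrarily close to each of the points $\mathbf{x}_{1i}$ as $t \to \infty$. However,
the intersections of the integral path through $\mathbf{x}_0$ with L should be monotonic, by
Lemma 9.1, and we conclude that $\mathbf{x}_{1i} = \mathbf{x}_2$, and hence that $\omega(\mathbf{x}_1)$ is the closed
limit cycle through $\mathbf{x}_1$. □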


Example 

Consider the system

$$\dot{x} = x - y - 2x(x^2 + y^2), \quad \dot{y} = x + y - y(x^2 + y^2). \tag{9.21}$$

In order to analyze these equations, it is convenient to write them in terms of
polar coordinates, $(r, \theta)$. From the definitions, $r = (x^2 + y^2)^{1/2}$ and $x = r\cos\theta$,
$y = r\sin\theta$, the chain rule gives

$$\dot{r} = (x\dot{x} + y\dot{y})(x^2 + y^2)^{-1/2} = \cos\theta\,\dot{x} + \sin\theta\,\dot{y}. \tag{9.22}$$

For (9.21),

$$\dot{r} = \cos\theta\,(r\cos\theta - r\sin\theta - 2r^3\cos\theta) + \sin\theta\,(r\cos\theta + r\sin\theta - r^3\sin\theta)$$

$$= r - r^3(1 + \cos^2\theta).$$

Since $0 \le \cos^2\theta \le 1$, we conclude that

$$r(1 - 2r^2) \le \dot{r} \le r(1 - r^2).$$

Therefore $\dot{r} > 0$ for $0 < r < 1/\sqrt{2}$ and $\dot{r} < 0$ for r > 1. Remembering that if $\dot{r} > 0$, r
is an increasing function of t, and therefore integral paths are directed away from the
origin, we can define a closed, bounded, annular region $I = \{(r, \theta) \mid r_0 \le r \le r_1\}$
with $0 < r_0 < 1/\sqrt{2}$ and $r_1 > 1$, such that all integral paths enter the region I, as
shown in Figure 9.14. If we can show that there are no equilibrium points in I, the
Poincaré-Bendixson theorem shows that at least one limit cycle exists in I.



Fig. 9.14. The region I that contains a limit cycle for the system (9.21). 


From the definition of $\theta = \tan^{-1}(y/x)$,

$$ \dot{\theta} = \frac{1}{1 + (y/x)^2}\,\frac{x\dot{y} - y\dot{x}}{x^2} = \frac{1}{r}(\cos\theta\,\dot{y} - \sin\theta\,\dot{x}). \tag{9.23} $$

For (9.21),

$$ \dot{\theta} = \frac{1}{r}\left\{\cos\theta\,(r\cos\theta + r\sin\theta - r^3\sin\theta) - \sin\theta\,(r\cos\theta - r\sin\theta - 2r^3\cos\theta)\right\} = 1 + \frac{1}{2}r^2\sin 2\theta. $$

Since $-1 \le \sin 2\theta \le 1$, we conclude that $\dot{\theta} > 0$ provided $r < \sqrt{2}$, and hence
that there are no equilibrium points for $0 < r < \sqrt{2}$. Therefore, provided that
$1 < r_1 < \sqrt{2}$, there are no equilibrium points in $I$, and hence, by the
Poincaré-Bendixson theorem, there is at least one limit cycle in $I$. Note that we know that
a limit cycle must enclose at least one node, focus or centre. In this example, there
is an unstable focus at $r = 0$, which is enclosed by any limit cycles, but which does
not lie within $I$. In general, if we can construct a region in the phase plane that
traps a limit cycle in this way, it cannot be simply-connected.
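The trapping argument above can be spot-checked numerically. The following is a minimal sketch, not part of the original example: it evaluates $\dot{r}$ and $\dot{\theta}$ directly from the Cartesian form of (9.21), compares them with the closed forms $r - r^3(1 + \cos^2\theta)$ and $1 + \frac{1}{2}r^2\sin 2\theta$, and samples the signs on an annulus. The radii `R0 = 0.6` and `R1 = 1.2` are hypothetical choices satisfying $0 < r_0 < 1/\sqrt{2}$ and $1 < r_1 < \sqrt{2}$.

```python
import math

def field(x, y):
    """Right-hand side of (9.21)."""
    s = x * x + y * y
    return x - y - 2.0 * x * s, x + y - y * s

def polar_rates(r, theta):
    """(r_dot, theta_dot) computed from the Cartesian field via the chain rule."""
    x, y = r * math.cos(theta), r * math.sin(theta)
    xd, yd = field(x, y)
    return (x * xd + y * yd) / r, (x * yd - y * xd) / (r * r)

R0, R1 = 0.6, 1.2  # hypothetical radii: R0 < 1/sqrt(2) and 1 < R1 < sqrt(2)
ANGLES = [2.0 * math.pi * i / 360 for i in range(360)]

# Largest mismatch between the chain-rule values and the closed forms.
formula_error = max(
    max(abs(polar_rates(r, th)[0] - (r - r ** 3 * (1 + math.cos(th) ** 2))),
        abs(polar_rates(r, th)[1] - (1 + 0.5 * r * r * math.sin(2 * th))))
    for r in (R0, 0.9, R1) for th in ANGLES)

# Sign checks on the two bounding circles and throughout the annulus.
inward_on_inner = all(polar_rates(R0, th)[0] > 0 for th in ANGLES)
inward_on_outer = all(polar_rates(R1, th)[0] < 0 for th in ANGLES)
theta_increasing = all(polar_rates(R0 + (R1 - R0) * j / 20, th)[1] > 0
                       for j in range(21) for th in ANGLES)
print(formula_error, inward_on_inner, inward_on_outer, theta_increasing)
```

Sampling a finite set of angles is only a sanity check; the inequalities in the text are what actually establish the result.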

A corollary of the Poincaré-Bendixson theorem is that, if $I$ is a closed, bounded,
invariant region of the phase plane, and $I$ contains no limit cycles, we can deduce
that all of the integral paths that meet the boundary of $I$ terminate at an
equilibrium point contained within $I$. We can use similar ideas to determine the
positions of the stable separatrices of $P_4$ for our example population dynamical
system, (9.20).

Consider the region

$$ R_1 = \{(x, y) \mid x \ge 0,\ y \ge 0,\ 2x + y \le 2,\ 2x + 2y \le 3\}, $$

shown in Figure 9.15. The separatrix $S_1$ lies in $R_1$ in the neighbourhood of the
saddle point, $P_4$. No integral path enters the region $R_1$ through its boundaries, and
we conclude that $S_1$ must originate at the equilibrium point at the origin, $P_1$.

Now consider the region

$$ R_2 = \{(x, y) \mid x \ge 0,\ y \ge 0,\ 2x + y \ge 2,\ 2x + 2y \ge 3,\ x^2 + y^2 \le r_0^2\}, $$

with $r_0 > 3$, as shown in Figure 9.15. The separatrix $S_2$ lies in $R_2$ in the
neighbourhood of the saddle point, $P_4$. No integral path enters the region $R_2$ through
any of its straight boundaries, and we conclude that $S_2$ must enter $R_2$ through its
curved boundary, $x^2 + y^2 = r_0^2$. Since we can make $r_0$ arbitrarily large, we conclude
that $S_2$ originates at infinity.


9.3.8 The Phase Portrait at Infinity 

In order to study the behaviour of a second order system far from the origin,
it is often useful to map the phase plane to a new coordinate system where the
point at infinity is mapped to a finite point. There are several ways of doing this,
but the most useful is the Poincaré projection. Consider the plane that passes
through the line $x = 1$, perpendicular to the $(x, y)$-plane, as shown in Figure 9.16.
We can set up a Cartesian coordinate system, $(u, v)$, in this plane, with origin a
unit distance above $x = 1$, $y = 0$, $u$-axis in the same direction as the $y$-axis and
$v$-axis vertically upwards. We label the origin of the $(x, y)$-plane as $O$, and denote
the point a unit distance above the origin as $O'$. We now consider a point $B$ in
the $(x, y)$-plane. The Poincaré projection maps the point $B$ to the point $C$ where
the line $O'B$ intersects the $(u, v)$-plane. In particular, as $x \to \infty$, $C$ approaches
the $u$-axis, so that this projection is useful for studying the behaviour of a second
order system for $x \gg 1$.

Fig. 9.15. The regions $R_1$ and $R_2$ for the population dynamical system (9.20).

In order to relate $u$ and $v$ to $x$ and $y$, we firstly consider the triangles $OAB$ and
$OA'B'$, shown in Figure 9.16, which are similar. This shows that $u = y/x$ and
$\sqrt{x^2 + y^2} - m = \sqrt{x^2 + y^2}/x$. Secondly, we consider the similar triangles $BB'C$
and $BOO'$, which show that $(1 + v)/m = 1/\sqrt{x^2 + y^2}$. By eliminating $m$, we find
that $v = -1/x$. The simple change of variables $u = y/x$, $v = -1/x$, and hence

$$ \dot{u} = uv\dot{x} - v\dot{y}, \quad \dot{v} = v^2\dot{x}, \tag{9.24} $$

is therefore a Poincaré projection.

If we apply this to the population dynamics example (9.20), we find that

$$ \dot{u} = -u\left(1 + \frac{u}{v}\right), \quad \dot{v} = -(2u + 3v + 2). \tag{9.25} $$

This has finite equilibrium points at $u = 0$, $v = -2/3$, which corresponds to $P_3$, and
$u = 2$, $v = -2$, which corresponds to $P_4$. The nature of these equilibrium points
remains the same after the projection, so that $P_3$ is a stable node and $P_4$ is a saddle
point. In order to determine the behaviour of (9.20) for $x \gg 1$, we need to consider
(9.25) close to the $u$-axis for $v < 0$ and $u > 0$. When $-v \ll 1$, $\dot{u} \gg 1$, whilst $\dot{v} < 0$.
Integral paths close to the $u$-axis therefore start parallel to it, but then move away.
Figure 9.17 is a sketch of the phase portrait in the $(u, v)$-plane. We conclude that
the integral paths of (9.20) for $x \gg 1$ lead into the finite $(x, y)$-plane as sketched in
Figure 9.9(b). The analogous transformation, $u = x/y$, $v = -1/y$, is also a Poincaré
projection, and can be used to examine the behaviour for $y \gg 1$.



Fig. 9.16. A Poincaré projection.


We have now answered all of the questions concerning the phase portrait of (9.20) 
that we asked at the end of Section 9.3.4. 
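The projected system (9.25) can be cross-checked against the chain-rule relations (9.24). The sketch below is illustrative only, and it assumes that (9.20) has the form $\dot{x} = x(3 - 2x - 2y)$, $\dot{y} = y(2 - 2x - y)$. That form is not restated in this section; it is inferred here from (9.25) and from the equilibrium points quoted above, so treat it as an assumption. The sample points are arbitrary.

```python
def xy_field(x, y):
    # Assumed form of (9.20): x_dot = x(3 - 2x - 2y), y_dot = y(2 - 2x - y).
    return x * (3 - 2 * x - 2 * y), y * (2 - 2 * x - y)

def uv_from_chain_rule(u, v):
    """Apply (9.24) with x = -1/v, y = -u/v."""
    x, y = -1.0 / v, -u / v
    xd, yd = xy_field(x, y)
    return u * v * xd - v * yd, v * v * xd

def uv_closed_form(u, v):
    """Right-hand side of (9.25)."""
    return -u * (1 + u / v), -(2 * u + 3 * v + 2)

# Arbitrary sample points with v < 0, including the two projected equilibria.
SAMPLES = [(0.5, -0.1), (1.3, -0.7), (3.0, -0.01), (2.0, -2.0), (0.0, -2.0 / 3.0)]

def worst_relative_mismatch():
    worst = 0.0
    for u, v in SAMPLES:
        a, b = uv_from_chain_rule(u, v), uv_closed_form(u, v)
        for p, q in zip(a, b):
            worst = max(worst, abs(p - q) / (1.0 + abs(q)))
    return worst

mismatch = worst_relative_mismatch()
at_P4 = uv_closed_form(2.0, -2.0)         # image of P4
at_P3 = uv_closed_form(0.0, -2.0 / 3.0)   # image of P3
print(mismatch, at_P4, at_P3)
```

Agreement at a handful of points does not replace the algebra, but it is a quick way to catch a sign error when carrying out the substitution by hand.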


9.3.9 A Final Example: Hamiltonian Systems 

Let $\Omega$ be a region of the $(x, y)$-plane and $H : \Omega \to \mathbb{R}$ be a real-valued,
continuously differentiable function defined on $\Omega$. The two-dimensional system of ordinary
differential equations,

$$ \dot{x} = H_y(x, y), \quad \dot{y} = -H_x(x, y), \tag{9.26} $$

is called a Hamiltonian system and $H(x, y)$ is called the Hamiltonian. Such
systems occur frequently in mechanics. One example is the simple pendulum, which
we studied earlier. As we have seen, this has $\dot{x} = y = H_y$, $\dot{y} = -\omega^2\sin x = -H_x$,
so that $H = \frac{1}{2}y^2 - \omega^2\cos x$, the total energy, is a Hamiltonian for the system.

Hamiltonian systems have several general properties, which we will now investi- 
gate. 

Theorem 9.5 The integral paths of a Hamiltonian system are given by H(x,y) = 
constant. 

Proof On an integral path $(x(t), y(t))$, $H = H(x(t), y(t))$ and

$$ \frac{dH}{dt} = H_x\dot{x} + H_y\dot{y} = H_x H_y - H_y H_x = 0, $$

so that $H$ is constant. □

For the simple pendulum, the integral paths are therefore given by the curves
$y^2 - 2\omega^2\cos x = \text{constant}$.
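Theorem 9.5 can be illustrated numerically for the pendulum. The sketch below integrates $\dot{x} = y$, $\dot{y} = -\omega^2\sin x$ with a classical Runge-Kutta scheme and measures how far $H = \frac{1}{2}y^2 - \omega^2\cos x$ drifts along the computed path. The value `OMEGA = 1.5`, the initial point, the step size and the integration time are all hypothetical choices made for illustration.

```python
import math

OMEGA = 1.5  # hypothetical value of omega, chosen only for illustration

def hamiltonian(x, y):
    return 0.5 * y * y - OMEGA ** 2 * math.cos(x)

def rhs(x, y):
    # x_dot = H_y = y, y_dot = -H_x = -omega^2 sin x
    return y, -OMEGA ** 2 * math.sin(x)

def rk4_path(x, y, t_end, dt=1e-3):
    """Integrate with the classical fourth-order Runge-Kutta method."""
    for _ in range(int(round(t_end / dt))):
        k1 = rhs(x, y)
        k2 = rhs(x + 0.5 * dt * k1[0], y + 0.5 * dt * k1[1])
        k3 = rhs(x + 0.5 * dt * k2[0], y + 0.5 * dt * k2[1])
        k4 = rhs(x + dt * k3[0], y + dt * k3[1])
        x += dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
        y += dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
    return x, y

x0, y0 = 1.0, 0.0
x1, y1 = rk4_path(x0, y0, 10.0)
energy_drift = abs(hamiltonian(x1, y1) - hamiltonian(x0, y0))
print(energy_drift)
```

Any remaining drift comes from the discretisation, not from the system: the exact integral path stays on a single level curve of $H$.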

Theorem 9.6 If $\mathbf{x} = \mathbf{x}_e$ is an equilibrium point of a Hamiltonian system with
nonzero eigenvalues, then it is either a saddle point or a centre.


Proof The Jacobian of (9.26) at $\mathbf{x} = \mathbf{x}_e$ is

$$ J = \begin{pmatrix} H_{yx} & H_{yy} \\ -H_{xx} & -H_{xy} \end{pmatrix}_{\mathbf{x} = \mathbf{x}_e}. $$

Since $H_{xy} = H_{yx}$, the eigenvalues of $J$ satisfy

$$ \lambda^2 = H_{xy}^2(x_e, y_e) - H_{xx}(x_e, y_e)H_{yy}(x_e, y_e). $$

Note that $H_{xy}^2 \ne H_{xx}H_{yy}$ at $\mathbf{x} = \mathbf{x}_e$ since the eigenvalues are nonzero. If $H_{xy}^2 >
H_{xx}H_{yy}$ at $\mathbf{x} = \mathbf{x}_e$, there is one positive and one negative eigenvalue, so $\mathbf{x}_e$ is
a saddle point. If $H_{xy}^2 < H_{xx}H_{yy}$ at $\mathbf{x} = \mathbf{x}_e$, there are two purely imaginary,
complex conjugate eigenvalues. Since the integral paths are given by $H = \text{constant}$ and the
conditions $H_x = H_y = 0$ and $H_{xx}H_{yy} > H_{xy}^2$ are those for a local maximum or
minimum at $\mathbf{x} = \mathbf{x}_e$, the level curves of $H$ are closed and surround $\mathbf{x} = \mathbf{x}_e$. We
conclude that the equilibrium point is a nonlinear centre. □

Theorem 9.7 (Liouville’s theorem) Hamiltonian systems are area-preserving.

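Proof Consider a small triangle with one vertex at $\mathbf{x}_0$, the other two vertices at
$\mathbf{x}_0 + \delta\mathbf{x}_1$ and $\mathbf{x}_0 + \delta\mathbf{x}_2$, and $|\delta\mathbf{x}_1|, |\delta\mathbf{x}_2| \ll 1$. The area of this triangle is $|A|$, where

$$ A = \frac{1}{2}(\delta\mathbf{x}_1 \times \delta\mathbf{x}_2). $$

If all of the vertices move along integral paths of the Hamiltonian system (9.26), a
Taylor expansion shows that

$$ \frac{d}{dt}\delta\mathbf{x}_i \sim \left(\delta x_i H_{yx}(\mathbf{x}_0) + \delta y_i H_{yy}(\mathbf{x}_0),\ -\delta x_i H_{xx}(\mathbf{x}_0) - \delta y_i H_{xy}(\mathbf{x}_0)\right), $$

for $i = 1, 2$, where $\delta\mathbf{x}_i = (\delta x_i, \delta y_i)$. Up to $O(|\delta\mathbf{x}_i|^2)$, we therefore have

$$ 2\frac{dA}{dt} = \frac{d}{dt}\delta\mathbf{x}_1 \times \delta\mathbf{x}_2 + \delta\mathbf{x}_1 \times \frac{d}{dt}\delta\mathbf{x}_2 $$

$$ = \delta y_2\{\delta x_1 H_{yx}(\mathbf{x}_0) + \delta y_1 H_{yy}(\mathbf{x}_0)\} + \delta x_2\{\delta x_1 H_{xx}(\mathbf{x}_0) + \delta y_1 H_{xy}(\mathbf{x}_0)\} $$

$$ \quad - \delta y_1\{\delta x_2 H_{yx}(\mathbf{x}_0) + \delta y_2 H_{yy}(\mathbf{x}_0)\} - \delta x_1\{\delta x_2 H_{xx}(\mathbf{x}_0) + \delta y_2 H_{xy}(\mathbf{x}_0)\} = 0, $$

so the area of the triangle is unchanged under the action of a Hamiltonian system.
Since any area in the phase plane can be broken up into infinitesimal triangles, the
Hamiltonian system is area-preserving. □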
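The triangle argument in the proof can be imitated numerically. The following sketch, which is an illustration and not part of the proof, carries the three vertices of a small triangle along integral paths of the pendulum system (taking $\omega = 1$, a hypothetical choice) and compares the signed area before and after. The triangle size `D`, the base point and the flow time are also arbitrary choices.

```python
import math

def rhs(x, y):
    # Pendulum with omega = 1 (illustrative choice): x_dot = H_y, y_dot = -H_x
    return y, -math.sin(x)

def flow(point, t_end, dt=1e-3):
    """Carry a point along its integral path using classical RK4."""
    x, y = point
    for _ in range(int(round(t_end / dt))):
        k1 = rhs(x, y)
        k2 = rhs(x + 0.5 * dt * k1[0], y + 0.5 * dt * k1[1])
        k3 = rhs(x + 0.5 * dt * k2[0], y + 0.5 * dt * k2[1])
        k4 = rhs(x + dt * k3[0], y + dt * k3[1])
        x += dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
        y += dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
    return x, y

def signed_area(p, q, r):
    """Half the cross product of the edge vectors q - p and r - p."""
    return 0.5 * ((q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0]))

D = 1e-4  # side length of the small triangle
triangle = [(1.0, 0.5), (1.0 + D, 0.5), (1.0, 0.5 + D)]
moved = [flow(p, 2.0) for p in triangle]
area_before = signed_area(*triangle)
area_after = signed_area(*moved)
relative_change = abs(area_after - area_before) / abs(area_before)
print(area_before, area_after, relative_change)
```

The relative change is not exactly zero because the image of a straight-sided triangle has slightly curved sides; that discrepancy shrinks with `D`, consistent with the $O(|\delta\mathbf{x}_i|^2)$ truncation in the proof.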


9.4 Third Order Autonomous Nonlinear Ordinary Differential 
Equations 

The solutions of the third order system

$$ \dot{x} = X(x, y, z), \quad \dot{y} = Y(x, y, z), \quad \dot{z} = Z(x, y, z), \tag{9.27} $$

can be analyzed in terms of integral paths in a three-dimensional, $(x, y, z)$-phase
space. However, most of the useful results that hold for the phase plane do not
hold in three or more dimensions. The crucial difference is that, in three or more
dimensions, closed curves no longer divide the phase space into two distinct regions,
inside and outside the curve. For example, integral paths inside a limit cycle in the
phase plane are trapped there, but this is not the case in a three-dimensional phase
space. There is no analogue of the Poincaré index or Bendixson’s negative criterion,
nor, as we noted earlier, is there an analogue of the Poincaré-Bendixson theorem.
In third or higher order systems, integral paths can be attracted to strange or
chaotic attractors, which have fractal or noninteger dimensions, and
represent chaotic solutions. A simple way to get a grasp of this is to remember
that cars, which drive about on a plane, often hit each other, but aircraft, which
have an extra dimension to use, do so more rarely. We will examine some
elementary techniques for studying chaotic solutions in Chapter 15. There are also
some interesting, and useful, conservative third order systems for which a more
straightforward analysis is possible (see also Exercise 13.4).

Example

Consider the third order system

$$ \dot{x} = yz, \quad \dot{y} = -xz, \quad \dot{z} = -k^2 xy. \tag{9.28} $$

In this rather degenerate system, any point on the $x$-, $y$- or $z$-axis is an equilibrium
point. We can also see that

$$ \frac{d}{dt}(x^2 + y^2) = 2x\dot{x} + 2y\dot{y} = 2xyz - 2yxz = 0. $$

Therefore, $x^2 + y^2$ is constant on any integral path, and hence all integral paths lie
on the surface of a cylinder with its axis pointing in the $z$-direction.

Consider the integral paths on the surface of the cylinder $x^2 + y^2 = 1$. This
surface is two-dimensional, so we should be able to analyze the behaviour of integral
paths on it using phase plane techniques. There are equilibrium points at $(0, \pm 1, 0)$
and $(\pm 1, 0, 0)$ and the Jacobian matrix is

$$ J = \begin{pmatrix} X_x & X_y & X_z \\ Y_x & Y_y & Y_z \\ Z_x & Z_y & Z_z \end{pmatrix} = \begin{pmatrix} 0 & z & y \\ -z & 0 & -x \\ -k^2 y & -k^2 x & 0 \end{pmatrix}. $$

This gives

$$ J(\pm 1, 0, 0) = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & \mp 1 \\ 0 & \mp k^2 & 0 \end{pmatrix}, \quad J(0, \pm 1, 0) = \begin{pmatrix} 0 & 0 & \pm 1 \\ 0 & 0 & 0 \\ \mp k^2 & 0 & 0 \end{pmatrix}. $$

The points $(\pm 1, 0, 0)$ each have eigenvalues $\lambda = 0, \pm k$. The zero eigenvalue, with
eigenvector $(1, 0, 0)^T$, corresponds to the fact that the $x$-axis is completely made up
of equilibrium points. The remaining two eigenvalues are real and of opposite sign,
and control the dynamics on the cylinder $x^2 + y^2 = 1$, where the equilibrium points
are saddles. Similarly, the points $(0, \pm 1, 0)$ each have eigenvalues $\lambda = 0, \pm ik$, and
are therefore linear centres on $x^2 + y^2 = 1$. These remain centres when nonlinear
terms are taken into account, using the argument that we described earlier for the
simple pendulum, since the system is unchanged by the transformation $z \mapsto -z$,
$t \mapsto -t$. The phase portrait is sketched in Figure 9.18.

We can confirm that this phase portrait is correct by noting that this system
actually has two other conserved quantities. From (9.28),

$$ \frac{d}{dt}(k^2 y^2 - z^2) = \frac{d}{dt}(k^2 x^2 + z^2) = 0, $$

and hence $k^2 y^2 - z^2$ and $k^2 x^2 + z^2$ are constant on any integral path. Integral paths
therefore lie on the intersection of the circular cylinder $x^2 + y^2 = \text{constant}$, the
hyperboloidal cylinder $k^2 y^2 - z^2 = \text{constant}$, and the elliptical cylinder $k^2 x^2 + z^2 =
\text{constant}$. This is precisely what the phase portrait in Figure 9.18 shows.

Fig. 9.18. The phase portrait of the third order system (9.28) on the surface of the cylinder
$x^2 + y^2 = 1$ when $k = 2$.

Finally, consider the integral path with $x = 0$, $y = z = 1$ when $t = 0$. On this
integral path, $k^2 x^2 + z^2 = 1$ and $x^2 + y^2 = 1$, so that

$$ \dot{x} = \sqrt{1 - x^2}\sqrt{1 - k^2 x^2}, $$

which gives†

$$ t = \int_0^x \frac{ds}{\sqrt{1 - s^2}\sqrt{1 - k^2 s^2}}. $$

This is the definition of the Jacobian elliptic function $\mathrm{sn}(t; k)$. On this integral
path $y$ and $z$ are also Jacobian elliptic functions, $y = \mathrm{cn}(t; k)$ and $z = \mathrm{dn}(t; k)$. The
phase portrait that we have just determined now allows us to see qualitatively that
these elliptic functions are periodic with $t$, provided that $k \ne 1$. In Sections 12.2.3
and 12.2.4 we will develop asymptotic expansions for $\mathrm{sn}(t; k)$, firstly when $k$ is close
to unity, and secondly when $k \ll 1$. The Jacobian elliptic functions will also prove
to be useful in Section 12.2.5.
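The conserved quantities and the integral formula for $t$ in this example can be checked with a short computation. The sketch below is illustrative: it integrates (9.28) from $x = 0$, $y = z = 1$ with a Runge-Kutta scheme, measures the drift in $x^2 + y^2$, $k^2y^2 - z^2$ and $k^2x^2 + z^2$, and then evaluates the integral from $0$ to the computed $x(T)$ by the midpoint rule to see whether it returns $T$. The modulus `K = 0.5`, the time `T = 1.0` and the step counts are hypothetical choices; `T` is kept short enough that $x$ is still increasing.

```python
import math

K = 0.5  # hypothetical modulus with k < 1, chosen only for illustration

def rhs(s):
    x, y, z = s
    return (y * z, -x * z, -K * K * x * y)

def rk4(s, t_end, dt=1e-3):
    """Classical RK4 for the three-component system (9.28)."""
    for _ in range(int(round(t_end / dt))):
        k1 = rhs(s)
        k2 = rhs(tuple(si + 0.5 * dt * ki for si, ki in zip(s, k1)))
        k3 = rhs(tuple(si + 0.5 * dt * ki for si, ki in zip(s, k2)))
        k4 = rhs(tuple(si + dt * ki for si, ki in zip(s, k3)))
        s = tuple(si + dt * (a + 2 * b + 2 * c + d) / 6.0
                  for si, a, b, c, d in zip(s, k1, k2, k3, k4))
    return s

def invariants(s):
    x, y, z = s
    return (x * x + y * y, K * K * y * y - z * z, K * K * x * x + z * z)

def elliptic_time(x, n=20000):
    """Midpoint-rule value of the integral of ds / (sqrt(1-s^2) sqrt(1-k^2 s^2)) from 0 to x."""
    h = x / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * h
        total += h / (math.sqrt(1.0 - s * s) * math.sqrt(1.0 - K * K * s * s))
    return total

T = 1.0
start = (0.0, 1.0, 1.0)
end = rk4(start, T)
drift = max(abs(a - b) for a, b in zip(invariants(start), invariants(end)))
t_from_integral = elliptic_time(end[0])
print(drift, t_from_integral)
```

If a library implementation of the Jacobian elliptic functions is available, comparing `end` with its values of sn, cn and dn would be a natural further check; note that libraries differ in whether they take the modulus $k$ or the parameter $k^2$ as argument.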


Exercises 

9.1 Consider the second order, autonomous ordinary differential equation

$$ \ddot{x} = 3x^2 - 1, $$

where a dot represents $d/dt$. By integrating this equation once, obtain a
relation between $\dot{x}$ and $x$. Sketch the phase portrait in the $(x, \dot{x})$-phase
plane. Determine the coordinates of the two equilibrium points and show
that there is a homoclinic orbit associated with one of them. What types
of behaviour occur inside and outside the homoclinic orbit?

† Note that the properties of the particular integral path with $x = 0$, $y = z = 1$ when $t = 0$
ensure that the arguments of the square roots remain positive.

9.2 By sketching the curve $\dot{x} = X(x)$, determine the equilibrium points and
corresponding domains of attraction when

(a) $X(x) = x^2 - x - 2$,

(b) $X(x) = e^{-x} - 1$,

(c) $X(x) = \sin x$.

Now check that this qualitative analysis is correct by actually solving each
equation with initial conditions $x = x_0$ when $t = 0$. Which method do you
think is easier to use, qualitative or quantitative?

9.3 Find the eigenvalues and eigenvectors of the matrix $A$, and then sketch the
phase portrait of the linear system $\dot{\mathbf{u}} = A\mathbf{u}$, where $\mathbf{u} = (x, y)^T$, for $A =$



9.4 Consider a second order linear system $\dot{\mathbf{u}} = A\mathbf{u}$, when the constant matrix
$A$ has two equal eigenvalues, $\lambda$. By using the Cayley-Hamilton theorem,
show that there must exist a linear transformation that takes $A$ to either

$$ \Lambda_1 = \begin{pmatrix} \lambda & 0 \\ 0 & \lambda \end{pmatrix} \quad \text{or} \quad \Lambda_2 = \begin{pmatrix} \lambda & 1 \\ 0 & \lambda \end{pmatrix}. $$

Solve the linear system of equations $\dot{\mathbf{v}} = \Lambda_j\mathbf{v}$ for $j = 1$ and $j = 2$, and
hence sketch the phase portrait in each case. Note that in the case $j = 1$,
the equilibrium point at the origin is known as a star, whilst when $j = 2$
it is an improper node.

9.5 A certain second order autonomous system has exactly two equilibrium 
points, both of which are saddles. Sketch a phase portrait in which (a) a 
separatrix connects the saddle points, (b) no separatrix connects the saddle 
points. 

9.6 The weight at the end of a simple pendulum experiences a frictional force 
proportional to its velocity. Determine the equation of motion and write 
it as a pair of first order equations. Show that the equilibrium points are 
either stable points or saddle points. Sketch the phase portrait. What 
happens after a long time? 

9.7 Find all of the equilibrium points of each of the following systems, and
determine their type. Sketch the phase portrait in each case.

(a) $\dot{x} = x - y$, $\dot{y} = x + y - 2xy^2$,

(b) $\dot{x} = -3y + xy - 10$, $\dot{y} = y^2 - x^2$,

(c) $\dot{x} = y^2 - 1$, $\dot{y} = \sin x$.

9.8 Consider the system of ordinary differential equations 

!=*py + i), | =

Determine the position and type of each equilibrium point in the $(x, y)$-plane.
Show that the coordinate axes are integral paths. Sketch the phase
portrait, taking care to ensure that your sketch is consistent with the
position of the horizontal nullcline. If $x = -1$ and $y = -1$ when $t = 0$, how
does the solution behave as $t \to \infty$?

9.9 The second order system

$$ \frac{dr}{dt} = r(3 - r - s), \quad \frac{ds}{dt} = s(2 - r - s), $$

can be used to model the population of sheep ($s$) and rabbits ($r$) in a closed
ecosystem. Determine the position and type of all the equilibrium points.
Find the directions of the separatrices close to any saddle points. Assuming
that there are no limit cycles, sketch the phase portrait for $r > 0$ and $s > 0$.
Which animal becomes extinct?

9.10 Explain how the system

$$ \dot{x} = x(-A + b_1 y), \quad \dot{y} = y(B - b_2 x), $$

with the constants $A$, $B$, $b_1$ and $b_2$ positive, models the populations of a
carnivorous, predator species and its herbivorous prey in a closed ecosystem.
Which variable is the predator population, $x$ or $y$? Determine the
type of each of the equilibrium points. Determine $dy/dx$ as a function of
$x$ and $y$, and integrate once to obtain an equation of the form $E(x, y) =
\text{constant}$. Show that $E(x, y)$ has a local minimum at one of the equilibrium
points, and hence deduce that it is a nonlinear centre. Sketch the phase
portrait. What happens to the populations?

9.11 Use the concept of the Poincaré index to determine which of the following
can be surrounded by a limit cycle in the phase portrait of a second order 
system. 

(a) an unstable node, 

(b) a saddle point, 

(c) two saddle points, a stable node and an unstable focus, 

(d) a saddle point, an unstable focus and a stable node. 

Sketch a possible phase portrait in each of the cases where a limit cycle 
can surround the equilibrium points. 

9.12 Consider the system

$$ \frac{dx}{dt} = x(x^2 + y^2 - 1), \quad \frac{dy}{dt} = y(x^2 + y^2 - 2). $$

Show that the $x$- and $y$-axes are integral paths. Show that there are no
limit cycle solutions using

(a) Dulac’s extension to Bendixson’s negative criterion with auxiliary
function $\rho(x, y) = 1/xy$,

(b) the Poincaré index.

Sketch the phase portrait.

9.13 Use the Poincaré index, Bendixson’s negative criterion or Dulac’s extension
as appropriate to show that the following systems have no limit cycle
solutions.

(a) $\dot{x} = y$, $\dot{y} = 1 + x^2 - (1 - x)y$,

(b) $\dot{x} = -(1 - x)^3 + xy^2$, $\dot{y} = y + y^3$,

(c) $\dot{x} = 2xy + x^3$, $\dot{y} = -x^2 + y - y^2 + y^3$,

(d) $\dot{x} = x$, $\dot{y} = 1 + x + y^2$,

(e) $\dot{x} = 1 - x^3 + y^2$, $\dot{y} = 2xy$,

(f) $\dot{x} = y + x^2$, $\dot{y} = -x - y + x^2 + y^2$.

(Hint: For (f), use Dulac’s extension to Bendixson’s negative criterion with
auxiliary function $\rho(x, y) = e^{ax + by}$.)

9.14 Write each of the following systems in terms of polar coordinates $(r, \theta)$, and
use the Poincaré-Bendixson theorem to show that at least one limit cycle
solution exists.

(a) $\dot{x} = 2x + 2y - x(2x^2 + y^2)$, $\dot{y} = -2x + 2y - y(2x^2 + y^2)$,

(b) x=x-y- x(x 2 + | y 2 ), y = x + y- y(x 2 + \y 2 ). 

9.15 (a) Write the system

$$ \frac{dx}{dt} = x - y - (2x^2 + y^2)x, \quad \frac{dy}{dt} = x + y - (x^2 + 2y^2)y $$

in terms of polar coordinates, and then use the Poincaré-Bendixson
theorem to show that there is at least one limit cycle solution.

(b) Use Dulac’s extension to Bendixson’s negative criterion (with an
auxiliary function of the form $e^{ax + by}$ for some suitable constants $a$
and $b$) to show that there is no limit cycle solution of the system
with

$$ \frac{dx}{dt} = y, \quad \frac{dy}{dt} = -x - y + x^2 + y^2. $$

9.16 (a) Write the system of ordinary differential equations

$$ \frac{dx}{dt} = x - y - y^3 - 2x^5 - 2x^3y^2 - xy^4, $$

$$ \frac{dy}{dt} = x + y + xy^2 - 2yx^4 - 2y^3x^2 - y^5 $$

in terms of polar coordinates $(r, \theta)$, and use the Poincaré-Bendixson
theorem to show that at least one limit cycle solution exists.

(b) Write the system of ordinary differential equations

dx o o dy o q o 

-=xy-xy + y 3 , — =y + x 3 - xy~ 
dt dt 

in terms of polar coordinates, $(r, \theta)$. Show that there is a single
equilibrium point at the origin and that it is nonhyperbolic. Show
that the lines $\theta = \pm\pi/4$ and $\theta = \pm 3\pi/4$ are integral paths. Show
that $dr/dt = 0$ when $\theta = 0$ or $\pi$. Sketch the phase portrait.

9.17 Consider the second order system

$$ \frac{dp}{ds} = -pq, \quad \frac{dq}{ds} = p + q - 2, $$

which arises in a thermal ignition problem (see Section 12.2.3). Show that
there are just two finite equilibrium points, one of which is a saddle.
Consider $S$, the stable separatrix of the saddle that lies in $p > 0$. Show that this
separatrix asymptotes to the other equilibrium point as $s \to -\infty$. (Hint:
First show that $S$ must meet the $p$-axis with $p > 2$. Next show that $S$
must go on to meet the $p$-axis with $p < 2$. Finally, use the coordinate axes
and $S$ up to its second intersection with the $p$-axis to construct a trapping
region for $S$.)

9.18 Project A particle of mass $m$ moves under the action of an attractive
central force of magnitude $\gamma m/r^\alpha$, where $(r, \theta)$ are polar coordinates and
$\gamma$ and $\alpha$ are positive constants. By using Newton’s second law of motion
in polar form, show that $u = 1/r$ satisfies the equation

$$ \frac{d^2u}{d\theta^2} + u = \frac{\gamma}{h^2}u^{\alpha - 2}, \tag{E9.1} $$

where $h$ is the angular momentum of the particle.

(a) Find the equilibrium points in the $(u, du/d\theta)$-phase plane, and classify
them. What feature of the linear approximation will carry over
to the solutions of the full, nonlinear system?

(b) If the particle moves at relativistic speeds, it can be shown that
(E9.1) is modified, in the case of the inverse square law, $\alpha = 2$,
appropriate to a gravitational attraction, to

$$ \frac{d^2u}{d\theta^2} + u = \frac{\gamma}{h^2} + \epsilon u^2, \tag{E9.2} $$

where $\epsilon$ is a small positive constant, and the term $\epsilon u^2$ is called
Einstein’s correction. Find the equilibrium point that corresponds to
a small perturbation of the Newtonian case ($\epsilon = 0$), and show that
it is a centre.

(c) Use MATLAB to solve (E9.2) numerically, and hence draw the phase
portrait for the values of $\epsilon$, $\gamma$ and $h$ appropriate to each of the
three planets nearest to the Sun (you’ll need to find an appropriate
astronomy book in your library), and relate what you obtain to part
(b) above.


Group Theoretical Methods


In this chapter we will develop an approach to the solution of differential equations 
based on finding a group invariant. In order to introduce the general idea, let’s 
begin by considering some simple, first order ordinary differential equations. 

The solution of the separable equation,

$$ \frac{dy}{dx} = f(x)g(y), $$

is

$$ \int \frac{dy}{g(y)} = \int f(x)\,dx + \text{constant}. $$

Another simple class of equations, often referred to as exact equations, takes the
form

$$ \frac{\partial\phi}{\partial x} + \frac{\partial\phi}{\partial y}\frac{dy}{dx} = 0. $$

In order to stress the equal role played by the independent variables in this equation,
we will usually write

$$ \frac{\partial\phi}{\partial x}\,dx + \frac{\partial\phi}{\partial y}\,dy = 0. $$

This has the solution $\phi(x, y) = \text{constant}$.

Let’s now consider the equation

$$ \frac{dy}{dx} = f\left(\frac{y}{x}\right). $$

In general, this is neither separable nor exact, and we are stuck unless we can use
some other property of the equation. An inspection reveals that the substitution
$x = \lambda\hat{x}$, $y = \lambda\hat{y}$, where $\lambda$ is any nonzero real constant, leaves the form of the equation
unchanged, since

$$ \frac{d\hat{y}}{d\hat{x}} = f\left(\frac{\hat{y}}{\hat{x}}\right). $$

We say that the equation is invariant under the transformation $x \mapsto \lambda x$, $y \mapsto \lambda y$.
The quantity $y/x \mapsto y/x$ is also invariant under the transformation. If we use this
invariant quantity, $v = y/x$, as a new dependent variable, the equation becomes

$$ v + x\frac{dv}{dx} = f(v), $$

which is separable, with solution

$$ \int \frac{dv}{f(v) - v} = \int \frac{dx}{x} + \text{constant}. $$
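As a concrete check of this reduction, the sketch below takes the hypothetical choice $f(v) = v + v^2$, which is not from the text. Then $f(v) - v = v^2$, the separated form integrates to $-1/v = \ln x + c$, and with $y = 1$ at $x = 1$ this gives $y = x/(1 - \ln x)$ for $x < e$. The code integrates $dy/dx = f(y/x)$ directly with a Runge-Kutta scheme and compares the result at $x = 2$ with that closed form.

```python
import math

def f(v):
    # Hypothetical choice f(v) = v + v^2, so that f(v) - v = v^2
    return v + v * v

def slope(x, y):
    """Right-hand side of dy/dx = f(y/x)."""
    return f(y / x)

def rk4(x, y, x_end, n=1000):
    """Classical RK4 in the independent variable x."""
    h = (x_end - x) / n
    for _ in range(n):
        k1 = slope(x, y)
        k2 = slope(x + 0.5 * h, y + 0.5 * h * k1)
        k3 = slope(x + 0.5 * h, y + 0.5 * h * k2)
        k4 = slope(x + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
        x += h
    return y

def closed_form(x):
    # From -1/v = ln x + c with v = 1 at x = 1, so c = -1 and y = x / (1 - ln x)
    return x / (1.0 - math.log(x))

numerical = rk4(1.0, 1.0, 2.0)
exact = closed_form(2.0)
print(numerical, exact)
```

The same pattern applies to any $f$ for which the integral of $1/(f(v) - v)$ can be evaluated; only `f` and `closed_form` would change.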

If we regard the parameter $\lambda$ as a continuous variable, the set of transformations
$x \mapsto \lambda x$, $y \mapsto \lambda y$ forms a group under composition of transformations. More
specifically, this is an example of a Lie group, named after the mathematician 
Sophus Lie. We will discuss exactly what we mean by a group and a Lie group 
below. It is the invariance of the differential equation under the action of this 
group that allows us to find a solution in closed form. 

In the following sections, we begin by developing as much of the theory of Lie 
groups as we will need, and then show how this can be used as a practical tool for 
the solution of differential equations. 


10.1 Lie Groups 

Let $D$ be a subset of $\mathbb{R}^2$ on which $x \mapsto x_1 = f(x, y; \epsilon)$, $y \mapsto y_1 = g(x, y; \epsilon)$ is
a well-defined transformation from $D$ into $\mathbb{R}^2$. We also assume that $x_1$ and $y_1$
vary continuously with the parameter $\epsilon$. This set of transformations forms a
one-parameter group, or Lie group, if

(i) the transformation with $\epsilon = 0$ is the identity transformation, so that
$f(x, y; 0) = x$, $g(x, y; 0) = y$,

(ii) the transformation with $-\epsilon$ gives the inverse transformation, so that, if $x_1 =
f(x, y; \epsilon)$ and $y_1 = g(x, y; \epsilon)$, then $x = f(x_1, y_1; -\epsilon)$ and $y = g(x_1, y_1; -\epsilon)$,

(iii) the composition of two transformations is also a member of the set of
transformations, so that if $x_1 = f(x, y; \epsilon)$, $y_1 = g(x, y; \epsilon)$, $x_2 = f(x_1, y_1; \delta)$ and
$y_2 = g(x_1, y_1; \delta)$, then $x_2 = f(x, y; \epsilon + \delta)$ and $y_2 = g(x, y; \epsilon + \delta)$.

Some simple one-parameter groups are:

(a) Horizontal translation, $H(\epsilon)$: $x_1 = x + \epsilon$, $y_1 = y$,

(b) Vertical translation, $V(\epsilon)$: $x_1 = x$, $y_1 = y + \epsilon$,

(c) Magnification, $M(\epsilon)$: $x_1 = e^\epsilon x$, $y_1 = e^\epsilon y$,

(d) Rotation, $R(\epsilon)$: $x_1 = x\cos\epsilon - y\sin\epsilon$, $y_1 = x\sin\epsilon + y\cos\epsilon$.
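Properties (i) to (iii) can be spot-checked numerically for the four groups listed. The sketch below evaluates each transformation at one test point and one pair of parameter values; this is a sanity check under those arbitrary choices, not a proof. The function `N` is a hypothetical non-example, added here for contrast, in which $\epsilon$ and $-\epsilon$ produce the same shift.

```python
import math

def H(e):
    return lambda x, y: (x + e, y)

def V(e):
    return lambda x, y: (x, y + e)

def M(e):
    return lambda x, y: (math.exp(e) * x, math.exp(e) * y)

def R(e):
    return lambda x, y: (x * math.cos(e) - y * math.sin(e),
                         x * math.sin(e) + y * math.cos(e))

def N(e):
    # Hypothetical non-example: e and -e give the same shift, so property (ii) fails.
    return lambda x, y: (x + e * e, y)

def close(p, q, tol=1e-12):
    return abs(p[0] - q[0]) < tol and abs(p[1] - q[1]) < tol

def passes_group_checks(T, point=(0.7, -1.3), eps=0.4, delta=-1.1):
    """Spot-check properties (i)-(iii) at one point and one pair of parameters."""
    identity = close(T(0.0)(*point), point)
    inverse = close(T(-eps)(*T(eps)(*point)), point)
    composition = close(T(delta)(*T(eps)(*point)), T(eps + delta)(*point))
    return identity and inverse and composition

results = {T.__name__: passes_group_checks(T) for T in (H, V, M, R, N)}
print(results)
```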

For example, to show that the set of transformations $M(\epsilon)$ forms a group, firstly
note that when $\epsilon = 0$, $x_1 = x$ and $y_1 = y$. Secondly, a simple rearrangement
gives $x = e^{-\epsilon}x_1$, $y = e^{-\epsilon}y_1$, so that the inverse transformation is given by $M(-\epsilon)$.
Finally, if $x_2 = e^\delta x_1$ and $y_2 = e^\delta y_1$ then $x_2 = e^\delta e^\epsilon x = e^{\delta + \epsilon}x$, and similarly with
$y_2$.


10.1.1 The Infinitesimal Transformation

By defining a Lie group via the transformation $x_1 = f(x, y; \epsilon)$, $y_1 = g(x, y; \epsilon)$,
we are giving the finite form of the group. Consider what happens when $\epsilon \ll 1$.
Since $\epsilon = 0$ gives the identity transformation, we can Taylor expand to obtain

$$ x_1 = x + \epsilon\left.\frac{dx_1}{d\epsilon}\right|_{\epsilon = 0} + \cdots, \quad y_1 = y + \epsilon\left.\frac{dy_1}{d\epsilon}\right|_{\epsilon = 0} + \cdots. $$

If we now introduce the functions

$$ \xi(x, y) = \left.\frac{dx_1}{d\epsilon}\right|_{\epsilon = 0}, \quad \eta(x, y) = \left.\frac{dy_1}{d\epsilon}\right|_{\epsilon = 0}, \tag{10.1} $$

and just retain the first two terms in the Taylor series expansions, we obtain $x_1 =
x + \epsilon\xi(x, y)$, $y_1 = y + \epsilon\eta(x, y)$. This is called the infinitesimal form of the group.
We will show later that every one-parameter group is associated with a unique
infinitesimal group.

For example, the transformation $x_1 = x\cos\epsilon - y\sin\epsilon$, $y_1 = x\sin\epsilon + y\cos\epsilon$
forms the rotation group $R(\epsilon)$. When $\epsilon = 0$ this gives the identity transformation.
Using the approximations $\cos\epsilon = 1 + \cdots$, $\sin\epsilon = \epsilon + \cdots$ for $\epsilon \ll 1$, we obtain the
infinitesimal rotation group as $x_1 \sim x - \epsilon y$, $y_1 \sim y + \epsilon x$, and hence $\xi(x, y) = -y$ and
$\eta(x, y) = x$. The transformation $x_1 = e^\epsilon x$, $y_1 = e^\epsilon y$ forms the magnification group
$M(\epsilon)$. Using $e^\epsilon = 1 + \epsilon + \cdots$ for $\epsilon \ll 1$, we obtain the infinitesimal magnification
group as $x_1 \sim (1 + \epsilon)x$, $y_1 \sim (1 + \epsilon)y$, so that $\xi(x, y) = x$ and $\eta(x, y) = y$.
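The definitions in (10.1) can be approximated by differencing the finite form in $\epsilon$. The sketch below, given only as an illustration, estimates $\xi$ and $\eta$ for the rotation and magnification groups with a central difference about $\epsilon = 0$. The test point and the step `h` are arbitrary choices.

```python
import math

def M(e):
    return lambda x, y: (math.exp(e) * x, math.exp(e) * y)

def R(e):
    return lambda x, y: (x * math.cos(e) - y * math.sin(e),
                         x * math.sin(e) + y * math.cos(e))

def infinitesimal(T, x, y, h=1e-6):
    """Central-difference estimate of (xi, eta) = d(x1, y1)/d(eps) at eps = 0."""
    x_plus, y_plus = T(h)(x, y)
    x_minus, y_minus = T(-h)(x, y)
    return (x_plus - x_minus) / (2 * h), (y_plus - y_minus) / (2 * h)

x, y = 0.7, -1.3
xi_R, eta_R = infinitesimal(R, x, y)  # should be close to (-y, x)
xi_M, eta_M = infinitesimal(M, x, y)  # should be close to (x, y)
print((xi_R, eta_R), (xi_M, eta_M))
```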

We will now show that every infinitesimal transformation group is similar, or
isomorphic, to a translation group. This means that, by using a change of
variables, we can make any infinitesimal transformation group look like $H(\epsilon)$ or $V(\epsilon)$,
which we defined earlier. Consider the equations that define $\xi$ and $\eta$ and write
them in the form

$$ \frac{dx_1}{d\epsilon} = \xi(x_1, y_1), \quad \frac{dy_1}{d\epsilon} = \eta(x_1, y_1), $$

a result that is correct at leading order by virtue of the infinitesimal nature of the
transformation, and which we shall soon see is exact. We can also write this in the
form

$$ \frac{dx_1}{\xi} = \frac{dy_1}{\eta} = d\epsilon. $$

Integration of this gives solutions that are, in principle, expressible in the form
$F_1(x_1, y_1) = C_1$ and $F_2(x_1, y_1) = C_2 + \epsilon$ for some constants $C_1$ and $C_2$. Since
$\epsilon = 0$ corresponds to the identity transformation, we can deduce that $F_1(x_1, y_1) =
F_1(x, y)$ and $F_2(x_1, y_1) = F_2(x, y) + \epsilon$. This means that if we define $u = F_1(x, y)$
and $v = F_2(x, y)$ as new variables, then the group can be represented by $u_1 = u$
and $v_1 = v + \epsilon$, so that the original group is isomorphic to the translation group
$V(\epsilon)$.


10.1.2 Infinitesimal Generators and the Lie Series

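Consider the change, $\delta\phi$, that occurs in a given smooth function $\phi(x, y)$ under
an infinitesimal transformation. We find that

$$ \delta\phi = \phi(x + \epsilon\xi, y + \epsilon\eta) - \phi(x, y) = \epsilon\left(\xi\frac{\partial\phi}{\partial x} + \eta\frac{\partial\phi}{\partial y}\right) + \cdots. $$

If we retain just this single term, which is consistent with the way we derived the
infinitesimal transformation, we can see that $\delta\phi$ can be written in terms of the
quantity

$$ U\phi = \xi\frac{\partial\phi}{\partial x} + \eta\frac{\partial\phi}{\partial y}, \tag{10.2} $$

or, in operator notation,

$$ U \equiv \xi\frac{\partial}{\partial x} + \eta\frac{\partial}{\partial y}. $$

This is called the infinitesimal generator of the group. Any infinitesimal
transformation is completely specified by $U\phi$. For example, if

$$ U\phi = -y\frac{\partial\phi}{\partial x} + x\frac{\partial\phi}{\partial y}, $$

$\xi(x, y) = -y$ and $\eta(x, y) = x$, so that the transformation is given by $x_1 = x - \epsilon y$,
$y_1 = y + \epsilon x$. From the definition (10.2), $Ux = \xi$ and $Uy = \eta$, so that

$$ U\phi = Ux\,\frac{\partial\phi}{\partial x} + Uy\,\frac{\partial\phi}{\partial y}, $$

and if a group acts on $(x, y)$ to produce new values $(x_1, y_1)$ then

$$ U\phi(x_1, y_1) = Ux_1\,\frac{\partial\phi}{\partial x_1} + Uy_1\,\frac{\partial\phi}{\partial y_1}. $$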

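Let’s now consider a group defined in finite form by $x_1 = f(x, y; \epsilon)$, $y_1 = g(x, y; \epsilon)$
and a function $\phi = \phi(x, y)$. If we regard $\phi(x_1, y_1; \epsilon)$ as a function of $\epsilon$, with a prime
denoting $d/d\epsilon$, we find that

$$ \phi(x_1, y_1; \epsilon) = \phi(x_1, y_1; 0) + \epsilon\phi'(x_1, y_1; 0) + \frac{1}{2}\epsilon^2\phi''(x_1, y_1; 0) + \cdots. $$

Since

$$ \phi(x_1, y_1; 0) = \phi(x, y), $$

$$ \phi'(x_1, y_1; 0) = \left[\frac{\partial\phi}{\partial x_1}\frac{dx_1}{d\epsilon} + \frac{\partial\phi}{\partial y_1}\frac{dy_1}{d\epsilon}\right]_{\epsilon = 0} = \xi\frac{\partial\phi}{\partial x} + \eta\frac{\partial\phi}{\partial y} = U\phi, $$

and, repeating the argument, $\phi''(x_1, y_1; 0) = U^2\phi$, we have

$$ \phi(x_1, y_1; \epsilon) = \phi(x, y) + \epsilon U\phi + \frac{1}{2}\epsilon^2 U^2\phi + \cdots. $$

This is known as a Lie series, and can be written more compactly in operator
form as

$$ \phi(x_1, y_1; \epsilon) = e^{\epsilon U}\phi(x, y). $$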

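In particular, if we take $\phi(x, y) = x$,

$$ x_1 = x + \epsilon Ux + \frac{1}{2}\epsilon^2 U^2 x + \cdots = x + \epsilon\xi + \frac{1}{2}\epsilon^2 U\xi + \cdots = x + \epsilon\xi + \frac{1}{2}\epsilon^2\left(\xi\frac{\partial\xi}{\partial x} + \eta\frac{\partial\xi}{\partial y}\right) + \cdots. $$

Similarly,

$$ y_1 = y + \epsilon\eta + \frac{1}{2}\epsilon^2\left(\xi\frac{\partial\eta}{\partial x} + \eta\frac{\partial\eta}{\partial y}\right) + \cdots. $$

These two relations are a representation of the group in finite form. It should now be
clear that we can calculate the finite form of the group from the infinitesimal group
(via the Lie series) and the infinitesimal group from the finite form of the group
(via expansions for small $\epsilon$). For example, if an infinitesimal group is represented
by

$$ U\phi = x\frac{\partial\phi}{\partial x} + y\frac{\partial\phi}{\partial y}, $$

then

$$ x_1 = x + \epsilon x + \frac{1}{2!}\epsilon^2 x + \frac{1}{3!}\epsilon^3 x + \cdots = xe^\epsilon, $$

$$ y_1 = y + \epsilon y + \frac{1}{2!}\epsilon^2 y + \frac{1}{3!}\epsilon^3 y + \cdots = ye^\epsilon. $$

The finite form is therefore $M(\epsilon)$, the magnification group.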

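As a further example, if an infinitesimal group is represented by

$$ U\phi = -y\frac{\partial\phi}{\partial x} + x\frac{\partial\phi}{\partial y}, $$

then

$$ Ux = -y, \quad Uy = x, \quad U^2 x = -x, \quad U^2 y = -y, $$
$$ U^3 x = y, \quad U^3 y = -x, \quad U^4 x = x, \quad U^4 y = y. $$

$U$ is therefore a cyclic operation with period 4 and the equations of the finite form
of the group are

$$ x_1 = x - \epsilon y - \frac{1}{2!}\epsilon^2 x + \frac{1}{3!}\epsilon^3 y + \frac{1}{4!}\epsilon^4 x + \cdots $$

$$ = x\left(1 - \frac{1}{2!}\epsilon^2 + \frac{1}{4!}\epsilon^4 - \cdots\right) - y\left(\epsilon - \frac{1}{3!}\epsilon^3 + \cdots\right) = x\cos\epsilon - y\sin\epsilon. $$

Similarly,

$$ y_1 = y + \epsilon x - \frac{1}{2!}\epsilon^2 y - \frac{1}{3!}\epsilon^3 x + \frac{1}{4!}\epsilon^4 y + \cdots $$

$$ = y\left(1 - \frac{1}{2!}\epsilon^2 + \frac{1}{4!}\epsilon^4 - \cdots\right) + x\left(\epsilon - \frac{1}{3!}\epsilon^3 + \cdots\right) = y\cos\epsilon + x\sin\epsilon, $$

and we have the rotation group $R(\epsilon)$.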
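The summation of the Lie series for the rotation generator can be mimicked with a few lines of code. In the sketch below, a linear function $ax + by$ is stored as the pair `(a, b)`, which is a representation chosen here for convenience, so that $U = -y\,\partial/\partial x + x\,\partial/\partial y$ acts as $(a, b) \mapsto (b, -a)$. Summing $\epsilon^n U^n/n!$ applied to $x$ and to $y$ should then reproduce the coefficients of $x\cos\epsilon - y\sin\epsilon$ and $x\sin\epsilon + y\cos\epsilon$. The value of `eps` and the number of terms are arbitrary.

```python
import math

def apply_U(coeffs):
    """U = -y d/dx + x d/dy acting on a*x + b*y, stored as the pair (a, b)."""
    a, b = coeffs
    return (b, -a)

def lie_series(coeffs, eps, terms=30):
    """Partial sum of sum_n eps^n U^n / n! applied to a*x + b*y."""
    total = [0.0, 0.0]
    term = coeffs
    factor = 1.0  # holds eps^n / n!
    for n in range(terms):
        total[0] += factor * term[0]
        total[1] += factor * term[1]
        term = apply_U(term)
        factor *= eps / (n + 1)
    return tuple(total)

eps = 0.8
x1_coeffs = lie_series((1.0, 0.0), eps)  # x1 written as a*x + b*y
y1_coeffs = lie_series((0.0, 1.0), eps)  # y1 written as a*x + b*y
print(x1_coeffs, y1_coeffs)
```

The period-4 cycle noted in the text is visible in `apply_U`: applying it four times returns the original pair.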

There is a rather more concise way of doing this, using the fact that

$$\frac{dx_1}{d\epsilon} = \xi(x_1,y_1), \quad \frac{dy_1}{d\epsilon} = \eta(x_1,y_1), \quad \text{subject to } x_1 = x \text{ and } y_1 = y \text{ at } \epsilon = 0,$$

is an exact relationship according to (10.1). For the first example above, this gives
$dx_1/d\epsilon = x_1$ and $dy_1/d\epsilon = y_1$, with $x_1 = x$ and $y_1 = y$ at $\epsilon = 0$. This first order
system can readily be integrated to give $x_1 = xe^{\epsilon}$ and $y_1 = ye^{\epsilon}$.
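
When the first order system cannot be integrated by hand, it can be integrated
numerically instead. The following is a minimal sketch using a classical fourth order
Runge-Kutta step; the function name `finite_form` and the step count are assumptions
made for illustration.

```python
import math

def finite_form(xi, eta, x, y, eps, steps=1000):
    """Integrate dx1/de = xi(x1, y1), dy1/de = eta(x1, y1) from e = 0 to e = eps."""
    def rhs(u, v):
        return xi(u, v), eta(u, v)

    h = eps / steps
    for _ in range(steps):
        k1 = rhs(x, y)
        k2 = rhs(x + 0.5 * h * k1[0], y + 0.5 * h * k1[1])
        k3 = rhs(x + 0.5 * h * k2[0], y + 0.5 * h * k2[1])
        k4 = rhs(x + h * k3[0], y + h * k3[1])
        x += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        y += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
    return x, y

# Magnification group: xi = x, eta = y, so (x1, y1) = (x e^eps, y e^eps).
mag = finite_form(lambda u, v: u, lambda u, v: v, 2.0, 3.0, 1.0)
# Rotation group: xi = -y, eta = x.
rot = finite_form(lambda u, v: -v, lambda u, v: u, 2.0, 3.0, 1.0)
print(mag, rot)
```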


10.2 Invariants Under Group Action 


Let $x_1 = f(x,y;\epsilon)$, $y_1 = g(x,y;\epsilon)$ be the finite form of a group and let the
infinitesimal transformation associated with the group have infinitesimal generator

$$U\phi = \xi\frac{\partial\phi}{\partial x} + \eta\frac{\partial\phi}{\partial y}.$$

A function $\Omega(x,y)$ is said to be invariant under the action of this group if, when $x_1$
and $y_1$ are derived from $x$ and $y$ by the operations of the group, $\Omega(x_1,y_1) = \Omega(x,y)$.
Using the Lie series, we can write

$$\Omega(x_1,y_1) = \Omega(x,y) + \epsilon U\Omega + \frac{1}{2}\epsilon^2U^2\Omega + \cdots = \Omega(x,y) + \epsilon U\Omega + \frac{1}{2}\epsilon^2U(U\Omega) + \cdots,$$

so that a necessary and sufficient condition for invariance is $U\Omega = 0$, and hence
that

$$\xi\frac{\partial\Omega}{\partial x} + \eta\frac{\partial\Omega}{\partial y} = 0.$$

This is a partial differential equation for $\Omega$, whose solution is $\Omega(x,y) = C$, a
constant, on the curve

$$\frac{dx}{\xi} = \frac{dy}{\eta}. \qquad (10.3)$$

Since this equation has only one solution, it follows that a one-parameter group has
only one invariant.

Now let's take a point $(x_0,y_0)$ and apply the infinitesimal transformation to it,
so that it is mapped to $(x_0 + \epsilon\xi, y_0 + \epsilon\eta)$. If we repeat this procedure infinitely
often, we can obtain a curve that is an integral of the differential system given by
(10.3). By varying the initial point $(x_0,y_0)$ we then obtain a family of curves, all
of which are solutions of (10.3), which we denote by $\Omega_f(x,y) = C$. This family of
curves is invariant under the action of the group in the sense that each curve in
the family is transformed into another curve of the same family under the action
of the group. Suppose $(x,y)$ becomes $(x_1,y_1)$ under the action of the group. This
means that $\Omega_f(x_1,y_1) = \text{constant}$ must represent the same family of curves. Using
the Lie series we can write

$$\Omega_f(x_1,y_1) = \Omega_f(x,y) + \epsilon U\Omega_f + \frac{1}{2}\epsilon^2U^2\Omega_f + \cdots.$$

The most general condition that forces the first two terms to be constant is that
$U\Omega_f = \text{constant}$ should represent one family of curves, because

$$U^n\Omega_f = U^{n-1}(U\Omega_f) = U^{n-1}(\text{constant}) = 0.$$

This is conveniently written as $U\Omega_f = F(\Omega_f)$ for some arbitrary nonzero function
$F$.


For example, the rotation group is represented in infinitesimal form by

$$U\phi = -y\frac{\partial\phi}{\partial x} + x\frac{\partial\phi}{\partial y}.$$

The equation for the invariants of this group is

$$-\frac{dx}{y} = \frac{dy}{x},$$

which can be easily integrated to give $\Omega = x^2 + y^2 = C$. This gives the intuitively
obvious result that circles are invariant under rotation!
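
The invariance condition, that U applied to the candidate invariant vanishes, is easy
to test numerically. The sketch below estimates the generator by central differences;
the function name `U` and the step size are illustrative choices, not part of the text.

```python
def U(omega, xi, eta, x, y, h=1e-5):
    """Central-difference estimate of xi*omega_x + eta*omega_y at the point (x, y)."""
    d_dx = (omega(x + h, y) - omega(x - h, y)) / (2 * h)
    d_dy = (omega(x, y + h) - omega(x, y - h)) / (2 * h)
    return xi(x, y) * d_dx + eta(x, y) * d_dy

# Rotation group: xi = -y, eta = x.
rot_xi = lambda x, y: -y
rot_eta = lambda x, y: x

circle = U(lambda x, y: x * x + y * y, rot_xi, rot_eta, 1.3, -0.7)
not_invariant = U(lambda x, y: x, rot_xi, rot_eta, 1.3, -0.7)
print(circle, not_invariant)
```

The first value is zero to within rounding error, since x^2 + y^2 is invariant, whilst
the second is Ux = -y, which is nonzero at the chosen point.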


10.3 The Extended Group 


If $x_1 = f(x,y;\epsilon)$ and $y_1 = g(x,y;\epsilon)$ form a group of transformations in the usual
way, we can extend the group by regarding the differential coefficient $p = dy/dx$
as a third independent variable. Under the transformation this becomes

$$p_1 = \frac{dy_1}{dx_1} = \frac{g_x + pg_y}{f_x + pf_y} = h(x,y,p;\epsilon).$$

It can easily be verified that the triple given by $(x,y,p)$ forms a group under the
transformations above. This is known as the extended group of the given group.
This extended group also has an infinitesimal form associated with it. If we write
$x_1 = x + \epsilon\xi(x,y)$, $y_1 = y + \epsilon\eta(x,y)$, then

$$p_1 = \frac{\epsilon\eta_x + p(1 + \epsilon\eta_y)}{1 + \epsilon\xi_x + \epsilon p\xi_y}.$$

Expanding this using the binomial theorem for small $\epsilon$, we find that

$$p_1 = p + \epsilon\left\{\eta_x + (\eta_y - \xi_x)p - \xi_yp^2\right\} = p + \epsilon\zeta. \qquad (10.4)$$

The infinitesimal generator associated with this three-element group is

$$U'\phi = \xi\phi_x + \eta\phi_y + \zeta\phi_p.$$

It is of course possible, though algebraically more messy, to form further extensions
of a given group by including second and higher derivatives.
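
As a check on (10.4), the coefficient of the first order term can be compared with a
numerical derivative of the finite form of the transformed slope. The sketch below does
this for the rotation group, for which (10.4) gives 1 + p^2; the function names are
hypothetical.

```python
import math

def zeta(p, xi_x, xi_y, eta_x, eta_y):
    """Right-hand side of (10.4): the coefficient of epsilon in p1 = p + epsilon*zeta."""
    return eta_x + (eta_y - xi_x) * p - xi_y * p ** 2

def p1_rotation(p, eps):
    """Transformed slope (g_x + p g_y)/(f_x + p f_y) for the finite rotation
    f = x cos(eps) - y sin(eps), g = x sin(eps) + y cos(eps)."""
    return (math.sin(eps) + p * math.cos(eps)) / (math.cos(eps) - p * math.sin(eps))

p = 0.8
h = 1e-6
numerical = (p1_rotation(p, h) - p1_rotation(p, -h)) / (2 * h)   # d p1 / d eps at eps = 0
predicted = zeta(p, xi_x=0.0, xi_y=-1.0, eta_x=1.0, eta_y=0.0)   # xi = -y, eta = x
print(numerical, predicted)
```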

We are now at the stage where we can use the theory that we have developed 
in this chapter to solve first order differential equations. Prior to this, it is helpful 
to outline two generic situations that may occur when you are confronted with a 
differential equation that needs to be solved. 

(i) It is straightforward to spot the group of transformations under which the 
equation is invariant. In this case we can give a recipe for using this invari- 
ance to solve the equation. 

(ii) No obvious group of transformations can be spotted and we need a more sys- 
tematic approach to construct the group. This is called Lie’s fundamental 
problem and is considerably more difficult than (i). Indeed, no general so- 
lution is known to Lie’s fundamental problem for first order equations. 


10.4 Integration of a First Order Equation with a Known Group 
Invariant 


To show more explicitly that this group invariance property will lead to a more
tractable differential equation than the original, let's consider a general first order
ordinary differential equation, $F(x,y,p) = 0$, that is invariant under the extended
group

$$U'\phi = \xi\frac{\partial\phi}{\partial x} + \eta\frac{\partial\phi}{\partial y} + \zeta\frac{\partial\phi}{\partial p},$$

derived from

$$U\phi = \xi\frac{\partial\phi}{\partial x} + \eta\frac{\partial\phi}{\partial y}.$$

We have seen that a sufficient condition for the invariance property is that $U'\phi = 0$,
so we are faced with solving

$$\xi\frac{\partial\phi}{\partial x} + \eta\frac{\partial\phi}{\partial y} + \zeta\frac{\partial\phi}{\partial p} = 0.$$

The solution curves of this partial differential equation, where $\phi$ is constant, are
the two independent solutions of the simultaneous system

$$\frac{dx}{\xi} = \frac{dy}{\eta} = \frac{dp}{\zeta}.$$

Let $u(x,y) = \alpha$ be a solution of

$$\frac{dx}{\xi} = \frac{dy}{\eta},$$

and $v(p,x,y) = \beta$ be the other independent solution.

We now show that if we know $U$, finding $v$ is simply a matter of integration. To
do this, recall the earlier result that any group with one parameter is similar to the
translation group. Let the change of variables from $(x,y)$ to $(x_1,y_1)$ reduce $U\phi$ to
the group of translations parallel to the $y_1$ axis, and call the infinitesimal generator
of this group $U_1f$. Then

$$U_1f = Ux_1\frac{\partial f}{\partial x_1} + Uy_1\frac{\partial f}{\partial y_1} = \frac{\partial f}{\partial y_1},$$

from which we see that $Ux_1 = 0$ and $Uy_1 = 1$, or more explicitly,

$$\xi\frac{\partial x_1}{\partial x} + \eta\frac{\partial x_1}{\partial y} = 0, \quad \xi\frac{\partial y_1}{\partial x} + \eta\frac{\partial y_1}{\partial y} = 1.$$

The first of these equations has the solution $x_1 = u(x,y)$ and the second is
equivalent to the simultaneous system

$$\frac{dx}{\xi} = \frac{dy}{\eta} = \frac{dy_1}{1}.$$

Again, one solution of this system is $u(x,y) = \alpha$. This can be used to eliminate $x$
from the second independent solution, given by

$$\frac{dy_1}{dy} = \frac{1}{\eta(x,y)},$$

so that by a simple integration we can obtain $y_1$ as a function of $x$ and $y$. As the
extended group of translations, $U_1'f$, is identical to $U_1f$, the most general differential
equation invariant under $U_1'$ in the new $x_1$, $y_1$ variables will therefore be a solution
of the simultaneous system

$$\frac{dx_1}{0} = \frac{dy_1}{1} = \frac{dp_1}{0}.$$

This particularly simple system has solutions $x_1 = \text{constant}$ and $p_1 = \text{constant}$, so
that the differential equation can be put in the form $p_1 = F(x_1)$ for some calculable
function $F$. In principle, it is straightforward to solve equations of this form, as
they are separable. The solution of the original equation can then be obtained by
returning to the $(x,y)$ variables.


Example

The differential equation

$$\frac{dy}{dx} = \frac{1}{\sqrt{x + y^2}}$$

is invariant under the transformation $x = e^{-2\epsilon}x_1$, $y = e^{-\epsilon}y_1$. The infinitesimal
transformation associated with this is $x_1 = x + 2\epsilon x$, $y_1 = y + \epsilon y$, so that $\xi(x,y) = 2x$
and $\eta(x,y) = y$. If we solve the system

$$\frac{dx}{\xi} = \frac{dy}{\eta},$$

we find that $y/x^{1/2} = e^C$, so that $x_1 = y/x^{1/2}$. Solving

$$\frac{dy_1}{dy} = \frac{1}{\eta} = \frac{1}{y}$$

gives $y_1 = \log y$. Some simple calculus then shows that the original differential
equation transforms to

$$\frac{\frac{1}{2}x_1^3\,\dfrac{dy_1}{dx_1}}{x_1\dfrac{dy_1}{dx_1} - 1} = \frac{x_1}{\sqrt{x_1^2 + 1}},$$

which, on rearranging, gives

$$\frac{dy_1}{dx_1} = \frac{1}{x_1\left(1 - \frac{1}{2}x_1\sqrt{x_1^2 + 1}\right)}.$$

This is the separable equation that we are promised by the theory we have
developed. The final integration of the equation can be achieved by the successive
substitutions $z = \log x_1$ and $t^2 = 1 + e^{-2z}$.
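
The change of variables in this example can be checked numerically. The sketch below
assumes that the example equation reads dy/dx = 1/sqrt(x + y^2), which is the form
consistent with the stated group and with the separable equation quoted above; the
function names are illustrative.

```python
import math

def F(x, y):
    """Right-hand side of the example equation, assumed to be 1/sqrt(x + y^2)."""
    return 1.0 / math.sqrt(x + y * y)

def transformed_slope(x, y):
    """dy1/dx1 by the chain rule, with x1 = y/sqrt(x) and y1 = log(y)."""
    p = F(x, y)
    dy1_dx = p / y                                        # d(log y)/dx
    dx1_dx = p / math.sqrt(x) - 0.5 * y * x ** -1.5       # d(y x^(-1/2))/dx
    return dy1_dx / dx1_dx

def separable_rhs(x1):
    """The separable form: 1 / (x1 (1 - x1 sqrt(x1^2 + 1) / 2))."""
    return 1.0 / (x1 * (1.0 - 0.5 * x1 * math.sqrt(x1 * x1 + 1.0)))

x, y, eps = 2.0, 0.5, 0.3
lhs = transformed_slope(x, y)
rhs = separable_rhs(y / math.sqrt(x))
# Invariance: with x1 = e^(2 eps) x and y1 = e^eps y, the slope scales by e^(-eps).
scaled = F(math.exp(2 * eps) * x, math.exp(eps) * y)
print(lhs, rhs, scaled, math.exp(-eps) * F(x, y))
```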


10.5 Towards the Systematic Determination of Groups Under Which 
a First Order Equation is Invariant 


If we consider a differential equation in the form

$$\frac{dy}{dx} = F(x,y),$$

and an infinitesimal transformation of the form $x_1 = x + \epsilon\xi(x,y)$, $y_1 = y + \epsilon\eta(x,y)$,
(10.4) shows that

$$\frac{dy_1}{dx_1} = \frac{dy}{dx} + \epsilon\left\{\eta_x + (\eta_y - \xi_x)\frac{dy}{dx} - \xi_y\left(\frac{dy}{dx}\right)^2\right\}.$$

Using the differential equation to eliminate $dy/dx$, we find that the equation will
be invariant under the action of the group provided that

$$\xi\frac{\partial F}{\partial x} + \eta\frac{\partial F}{\partial y} = \eta_x + (\eta_y - \xi_x)F - \xi_yF^2. \qquad (10.5)$$

So, given the function $F$, the fundamental problem is to determine two functions
$\xi$ and $\eta$ that satisfy this first order partial differential equation. Of course, this is
an underdetermined problem and has no unique solution. However, by choosing a
special form for either $\xi$ or $\eta$ there are occasions when the process works, as the
following example shows.


Example

Consider the equation

$$\frac{dy}{dx} = \frac{y}{x + x^2 + y^2}. \qquad (10.6)$$

In this case,

$$\frac{\partial F}{\partial x} = -\frac{(1 + 2x)y}{(x + x^2 + y^2)^2}, \quad \frac{\partial F}{\partial y} = \frac{x + x^2 - y^2}{(x + x^2 + y^2)^2},$$

and (10.5) takes the form

$$(x + x^2 - y^2)\eta - (1 + 2x)y\xi = (x + x^2 + y^2)^2\eta_x + (\eta_y - \xi_x)y(x + x^2 + y^2) - y^2\xi_y.$$

If now we choose $\eta = 1$, this reduces to

$$-(x + x^2 - y^2) + (1 + 2x)y\xi = y(x + x^2 + y^2)\xi_x + y^2\xi_y.$$

It is not easy to solve even this equation in general, but after some trial and error,
we can find the solution $\xi = x/y$. The infinitesimal transformation in this case is
$x_1 = x + \epsilon x/y$, $y_1 = y + \epsilon$ and, following the procedure outlined in the last section,
we now solve

$$\frac{dx}{x/y} = \frac{dy}{1}$$

to obtain $y/x = \text{constant}$. We therefore take $x_1 = y/x$ and hence $y_1 = y$ as our
new variables. In terms of these variables, (10.6) becomes

$$\frac{dy_1}{dx_1} = -\frac{1}{1 + x_1^2},$$

with solution $y_1 = -\tan^{-1}x_1 + C$. The solution of our original differential equation
can therefore be written in the form $y + \tan^{-1}(y/x) = C$.
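
The implicit solution can be tested by integrating (10.6) numerically and checking that
y + arctan(y/x) stays constant along the computed solution. This is a minimal sketch
with a hand-written Runge-Kutta integrator; the starting point and step count are
arbitrary choices.

```python
import math

def F(x, y):
    """Right-hand side of (10.6)."""
    return y / (x + x * x + y * y)

def rk4(f, x, y, x_end, steps=2000):
    """Integrate dy/dx = f(x, y) from x to x_end with classical RK4."""
    h = (x_end - x) / steps
    for _ in range(steps):
        k1 = f(x, y)
        k2 = f(x + 0.5 * h, y + 0.5 * h * k1)
        k3 = f(x + 0.5 * h, y + 0.5 * h * k2)
        k4 = f(x + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        x += h
    return y

def invariant(x, y):
    """The quantity that should be constant on each solution curve."""
    return y + math.atan(y / x)

y_end = rk4(F, 1.0, 1.0, 3.0)
print(invariant(1.0, 1.0), invariant(3.0, y_end))
```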


10.6 Invariants for Second Order Differential Equations 


First order differential equations can be invariant under an infinite number of
one-parameter groups. Second order differential equations can only be invariant under
at most eight groups. To see where this figure comes from, let's consider the simplest
form of a variable coefficient, linear, second order, differential equation,

$$\frac{d^2y}{dx^2} + q(x)y = 0. \qquad (10.7)$$

Writing $x_1 = x + \epsilon\xi$ and $y_1 = y + \epsilon\eta$ we have already shown that

$$\frac{dy_1}{dx_1} = \frac{dy}{dx} + \epsilon\left\{\eta_x + (\eta_y - \xi_x)\frac{dy}{dx} - \xi_y\left(\frac{dy}{dx}\right)^2\right\} = \frac{dy}{dx} + \epsilon\Pi = p + \epsilon\Pi(x,y,p).$$

Now

$$\frac{d^2y_1}{dx_1^2} = \frac{d}{dx_1}\left(\frac{dy_1}{dx_1}\right) = \frac{d}{dx}\left(\frac{dy_1}{dx_1}\right)\frac{dx}{dx_1},$$

and

$$\frac{d}{dx}\left(\frac{dy_1}{dx_1}\right) = \frac{dp}{dx} + \epsilon\left(\Pi_x + \Pi_yp + \Pi_pp_x\right) + \cdots, \quad \frac{dx}{dx_1} = \frac{1}{1 + \epsilon(\xi_x + \xi_yp)},$$

so that

$$\frac{d^2y_1}{dx_1^2} = \frac{d^2y}{dx^2} + \epsilon\left\{\Pi_x + \Pi_y\frac{dy}{dx} + \left(\Pi_p - \xi_x - \xi_y\frac{dy}{dx}\right)\frac{d^2y}{dx^2}\right\} + \cdots.$$

The condition for invariance is therefore

$$\Pi_x + \Pi_y\frac{dy}{dx} + \left(\Pi_p - \xi_x - \xi_y\frac{dy}{dx}\right)\frac{d^2y}{dx^2} + \xi q'(x)y + q(x)\eta = 0.$$

Some simple calculation leads to

$$\left\{\eta_{xx} + (2\xi_x - \eta_y)q(x)y + \xi q'(x)y + q(x)\eta\right\} + \left\{2\eta_{xy} - \xi_{xx} + 3q(x)y\xi_y\right\}\frac{dy}{dx}$$
$$+ \left(\eta_{yy} - 2\xi_{xy}\right)\left(\frac{dy}{dx}\right)^2 - \xi_{yy}\left(\frac{dy}{dx}\right)^3 = 0.$$

If we now set the coefficients of powers of $dy/dx$ to zero, we obtain

$$\xi_{yy} = 0, \quad \eta_{yy} - 2\xi_{xy} = 0, \quad 2\eta_{xy} - \xi_{xx} + 3q(x)y\xi_y = 0, \qquad (10.8)$$

$$\eta_{xx} + (2\xi_x - \eta_y)q(x)y + \xi q'(x)y + q(x)\eta = 0. \qquad (10.9)$$

Equation (10.8)$_1$ can be integrated to give

$$\xi = \rho(x)y + \hat{\xi}(x).$$

Substitution of this into (10.8)$_2$ gives $\eta_{yy} = 2\rho'(x)$, which can be integrated to give

$$\eta = \rho'(x)y^2 + \hat{\eta}(x)y + \zeta(x).$$

From (10.8)$_3$,

$$3\rho(x)q(x)y + 3\rho''(x)y + 2\hat{\eta}'(x) - \hat{\xi}''(x) = 0. \qquad (10.10)$$

Finally, (10.9) gives

$$\left\{\rho'''(x)y^2 + \hat{\eta}''(x)y + \zeta''(x)\right\} - q(x)y\left\{\hat{\eta}(x) - 2\hat{\xi}'(x)\right\} + \left\{\rho(x)y + \hat{\xi}(x)\right\}q'(x)y$$
$$+ \left\{\rho'(x)y^2 + \hat{\eta}(x)y + \zeta(x)\right\}q(x) = 0. \qquad (10.11)$$

We can find a solution of (10.10) and (10.11) by noting that the coefficient of each
power of $y$ must be zero, which gives us four independent equations,

$$\zeta''(x) + q(x)\zeta(x) = 0, \quad \rho''(x) + q(x)\rho(x) = 0,$$

$$\hat{\eta}''(x) - q(x)\left(\hat{\eta}(x) - 2\hat{\xi}'(x)\right) + \hat{\xi}(x)q'(x) + \hat{\eta}(x)q(x) = 0,$$

$$\hat{\xi}''(x) = 2\hat{\eta}'(x).$$

At this stage notice that $\rho(x)$ and $\zeta(x)$ satisfy the original second order equation,
(10.7), the solution of which will involve four constants. There are also second
order equations for $\hat{\xi}(x)$ and $\hat{\eta}(x)$, the solution of which gives rise to a further four
constants. Each of these constants will generate a one-parameter group that leaves
the original equation invariant, so the original equation is invariant under at most
eight one-parameter groups. Some of these groups are obvious. For example, since
(10.7) is invariant under magnification in $y$, it must be invariant under $x_1 = x$,
$y_1 = e^{\epsilon}y$.

Let's consider the group generated by $\rho(x)$ in more detail. In the usual way, we
have to integrate

$$\frac{dx}{\xi(x,y)} = \frac{dy}{\eta(x,y)},$$

which leads to the equation

$$\frac{dy}{dx} = \frac{y\left\{\rho'(x)y + \frac{1}{2}\hat{\xi}'(x)\right\} + \zeta(x)}{\rho(x)y + \hat{\xi}(x)}.$$

This has a solution of the form $y = C\rho(x)$ provided that $\hat{\xi}(x) = C\rho^2(x)$ and $\zeta(x) = 0$
for some constant $C$. Note that $\hat{\eta}(x) = \frac{1}{2}\hat{\xi}'(x)$ by direct integration. This means
that, following the ideas of the previous section, we should define $x_1 = y/\rho(x)$.
Now,

$$\frac{dy_1}{dy} = \frac{1}{\eta(x,y)} = \frac{1}{\rho'(x)y^2 + C\rho\rho'y} = \frac{1}{\rho'(x)\left(y^2 + C\rho y\right)},$$

and, since $y = C\rho(x)$,

$$\frac{dy_1}{dx} = \frac{1}{2C\rho^2},$$

which gives

$$y_1 = \frac{1}{2C}\int_{x_0}^x\frac{dx}{\rho^2} = \frac{\rho(x)}{2y}\int_{x_0}^x\frac{dt}{\rho^2(t)}.$$


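We can write the differential equation in terms of these new variables by noting
that

$$x_1y_1 = \frac{1}{2}\int_{x_0}^x\frac{dt}{\rho^2(t)},$$

so that

$$\frac{d}{dx}(x_1y_1) = \frac{1}{2\rho^2(x)}.$$

Now

$$\frac{dx_1}{d(x_1y_1)} = \frac{dx_1}{dx}\frac{dx}{d(x_1y_1)} = 2\rho^2\left(\frac{1}{\rho}\frac{dy}{dx} - \frac{\rho'y}{\rho^2}\right) = 2\left(\rho\frac{dy}{dx} - \rho'y\right),$$

which gives

$$\frac{d}{dx}\left(\frac{dx_1}{d(x_1y_1)}\right) = 2\left(\rho\frac{d^2y}{dx^2} + \rho'\frac{dy}{dx} - \rho''y - \rho'\frac{dy}{dx}\right) = 2\rho\left\{\frac{d^2y}{dx^2} + q(x)y\right\} = 0.$$

Integrating this expression gives

$$\frac{dx_1}{d(x_1y_1)} = C_1,$$

so that $x_1 = C_1x_1y_1 + C_2$ or, in terms of the original variables (absorbing a factor
of $\frac{1}{2}$ into $C_1$),

$$y = C_1\rho(x)\int_{x_0}^x\frac{dt}{\rho^2(t)} + C_2\rho(x).$$
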

This is just the reduction of order formula, which we derived in Chapter 1. Although 
the invariance property produces this via a route that is rather different from that 
taken in Chapter 1, it is not particularly useful to us in finding solutions. 

Let's now consider the invariance $x_1 = x$ and $y_1 = e^{\epsilon}y$, which has $\xi = 0$ and
$\eta = y$. This means that

$$\frac{dx}{0} = \frac{dy}{y},$$

and we can see that this suggests using the new variables $x_1 = x$ and $y_1 = \log y$.
Since $y'' = e^{y_1}\left(y_1'' + y_1'^2\right)$ under this change of variables, we obtain $y_1'' + y_1'^2 + q(x) = 0$.
Putting $Y_1 = y_1'$ leads to $Y_1' + Y_1^2 + q(x) = 0$. This is a first order equation, so
the group invariance property has allowed us to reduce the order of the original
equation. As an example of this, consider the equation

$$y'' + \frac{1}{4x^2}y = 0.$$

In this case, we obtain

$$Y_1' + Y_1^2 + \frac{1}{4x^2} = 0.$$

This is a form of Riccati's equation, which in general is difficult to solve. The
exception to this is if we can spot a solution, when the equation will linearize. For
this example, we can see that $Y_1 = 1/2x$ is a solution. Writing $Y_1 = 1/2x + V^{-1}$
linearizes the equation and it is straightforward to show that the general solution
is $y = C_1x^{1/2} + C_2x^{1/2}\log x$.
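
A quick numerical check of this example, assuming the equation is y'' + y/(4x^2) = 0
(the form consistent with the solutions quoted in the text): the logarithmic derivative
of the general solution should satisfy the Riccati equation. The function names below
are illustrative.

```python
import math

def riccati_residual(Y, x, h=1e-5):
    """Residual of Y' + Y^2 + 1/(4 x^2) = 0, with Y' from a central difference."""
    dY = (Y(x + h) - Y(x - h)) / (2 * h)
    return dY + Y(x) ** 2 + 1.0 / (4 * x * x)

def general_Y1(x, C1, C2):
    """Y1 = y'/y for y = C1 x^(1/2) + C2 x^(1/2) log x."""
    y = math.sqrt(x) * (C1 + C2 * math.log(x))
    dy = (0.5 * (C1 + C2 * math.log(x)) + C2) / math.sqrt(x)
    return dy / y

res_particular = riccati_residual(lambda x: 1.0 / (2 * x), 2.0)
res_general = riccati_residual(lambda x: general_Y1(x, 1.0, 3.0), 2.0)
print(res_particular, res_general)
```

Both residuals should vanish to within the finite difference error.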

At this stage, you could of course argue that you could have guessed the solution
$u_1(x) = x^{1/2}$ and then reduced the order of the equation to obtain the second
solution $u_2(x) = x^{1/2}\log x$. The counter argument to this is that the group
theoretical method gives both the technique of reduction of order and an algorithm for
reducing the original equation to a first order differential equation. The point can
perhaps be reinforced by considering the nonautonomous equation

$$y'' + \frac{1}{x}y' + e^y = 0.$$

Using the methods derived in this chapter, we first introduce a new dependent
variable, $Y$, defined by $y = -2\log x + Y$. This gives us

$$Y'' + \frac{1}{x}Y' + \frac{e^Y}{x^2} = 0.$$

The invariance of this under $x$-magnification suggests introducing $z = \log x$ and
leads to

$$\frac{d^2Y}{dz^2} + e^Y = 0,$$

which we can immediately integrate to

$$\frac{1}{2}\left(\frac{dY}{dz}\right)^2 = C - e^Y.$$

The analysis of this equation is of course much simpler than the original one.


10.7 Partial Differential Equations 


The ideas given in this chapter can be considerably extended, particularly to the
area of partial differential equations. As a simple example, consider the equation
for the diffusion of heat that we derived in Chapter 2, namely

$$\frac{\partial T}{\partial t} = D\frac{\partial^2T}{\partial x^2}. \qquad (10.12)$$

This equation is invariant under the group of transformations

$$x = e^{\epsilon}x_1, \quad t = e^{2\epsilon}t_1,$$

so that $x/t^{1/2}$ is an invariant of the transformation (can you spot which other groups
it is invariant under?). If we write $\eta = x/t^{1/2}$ (here $\eta$ is known as a similarity
variable), this reduces the partial differential equation to the ordinary differential
equation

$$D\frac{d^2T}{d\eta^2} + \frac{1}{2}\eta\frac{dT}{d\eta} = 0.$$

For the initial condition $T(x,0) = H(-x)$, which is also invariant under the
transformation, it is readily established from the ordinary differential equation that the
solution is

$$T(x,t) = \frac{1}{\sqrt{\pi}}\int_{x/\sqrt{4Dt}}^{\infty}e^{-s^2}\,ds = \frac{1}{2}\,\mathrm{erfc}\left(\frac{x}{\sqrt{4Dt}}\right), \qquad (10.13)$$

which is shown in Figure 10.1.

Finally, if the initial condition is $T(x,0) = \delta(x)$, we can use the fact that (10.12)
is also invariant under the two-parameter group

$$x = e^{\epsilon}x_1, \quad t = e^{2\epsilon}t_1, \quad T = e^{\mu}T_1.$$

The initial condition transforms to $e^{\mu}T_1(x_1,0) = e^{-\epsilon}\delta(x_1)$, since $\delta(ax) = \delta(x)/a$,
so that the choice $\mu = -\epsilon$ makes both differential equation and initial condition
invariant. As before, $x/t^{1/2}$ is an invariant, and now $t^{1/2}T$ is invariant as well. If
we therefore look for a solution of the form $T(x,t) = t^{-1/2}F(x/t^{1/2})$, we find that

$$DF_{\eta\eta} + \frac{1}{2}\eta F_{\eta} + \frac{1}{2}F = 0.$$

A suitable solution of this is $Ae^{-\eta^2/4D}$ (for example, using the method of Frobenius),
and hence

$$T(x,t) = At^{-1/2}e^{-x^2/4Dt}. \qquad (10.14)$$

Fig. 10.1. The solution, (10.13), of the diffusion equation.

To find the constant $A$, we can integrate (10.12) to obtain the integral
conservation law,

$$\frac{d}{dt}\int_{-\infty}^{\infty}T\,dx = 0.$$

Since

$$\int_{-\infty}^{\infty}T(x,0)\,dx = \int_{-\infty}^{\infty}\delta(x)\,dx = 1,$$

we must have $\int_{-\infty}^{\infty}T(x,t)\,dx = 1$. Then, from (10.14), $A = 1/\sqrt{4\pi D}$, and hence

$$T(x,t) = \frac{1}{\sqrt{4\pi Dt}}e^{-x^2/4Dt}. \qquad (10.15)$$

This is known as the point source solution of the diffusion equation. Notice the
similarity between this solution and the sequence (5.27), which we can use to define
the Dirac delta function, as shown in Figure 5.4.
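
The two defining properties of the point source solution, unit total heat and
satisfaction of the diffusion equation, can be checked numerically. The sketch below
uses simple quadrature and finite differences; the value D = 1 and the grid parameters
are arbitrary illustrative choices.

```python
import math

def point_source(x, t, D=1.0):
    """The point source solution (10.15)."""
    return math.exp(-x * x / (4 * D * t)) / math.sqrt(4 * math.pi * D * t)

def total_heat(t, D=1.0, half_width=40.0, n=8000):
    """Approximate the integral of T over the real line on [-half_width, half_width].

    The integrand is negligible at the end points, so a plain Riemann sum on a
    uniform grid coincides with the trapezium rule here.
    """
    h = 2 * half_width / n
    return h * sum(point_source(-half_width + i * h, t, D) for i in range(n + 1))

def pde_residual(x, t, D=1.0, h=1e-3):
    """Finite-difference estimate of T_t - D T_xx, which should be close to zero."""
    T_t = (point_source(x, t + h, D) - point_source(x, t - h, D)) / (2 * h)
    T_xx = (point_source(x + h, t, D) - 2 * point_source(x, t, D)
            + point_source(x - h, t, D)) / (h * h)
    return T_t - D * T_xx

print(total_heat(0.5), pde_residual(0.3, 0.5))
```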

Further details of extensions to the basic method can be found in Bluman and 
Cole (1974) and Hydon (2000). We end this chapter with a recommendation. Given 
an ordinary or partial differential equation, try to find a simple invariant, such as 
translation, magnification or rotation. If you find one, this will lead to a reduction 
of order for an ordinary differential equation, or the transformation of a partial 
differential equation into an ordinary differential equation. If you cannot find an 
invariant by inspection, then you need to make a systematic search for invariants. 
This can involve a lot of calculation, and is best done by a computer algebra package.
Hydon (2000) gives a list and critique of some widely available packages. 


Exercises 

10.1 Show that each of the differential equations 


dy , ^ _ 4 f dy 


10.2 


(a) x Tx + y = x U 

(b) - yx m = 0, 
dx 

(c) (S) =y+x2 ’ 

is invariant under a group of transformations with infinitesimal generator 

r r, ^ , . dfi 

U(p= ax— + by — , 
ox ay 

and hence integrate each equation. 

Find the general differential equation of first order, invariant under the 
groups 


(a) U(j) = 


dcj) d(j> 
dx V dy' 


/i \ r T I d(j) 8(j) 

(b) U * = X di + ay lTy ' 

, . d(p d<j> 

(c) U,p = y di + W 

10.3 If the differential equation

$$\frac{dx}{P(x,y)} = \frac{dy}{Q(x,y)}$$

is invariant under a group with infinitesimal generators $\xi$ and $\eta$, show that
it has a solution in integrating factor form,

$$\int\frac{P\,dy - Q\,dx}{P\eta - Q\xi} = \text{constant}.$$

Hence find the solution of

$$\frac{dy}{dx} = \frac{y^4 - 2x^3y}{2xy^3 - x^4}.$$


10.4 Show that the differential equation 

§ + P (^ = 0 

is invariant under at most six one-parameter groups. For the case p(x) = 
x m , reduce the order of the equation. 
10.5 Find the most general first order differential equation invariant under the
group with infinitesimal generator

$$U\phi = \exp\left\{\int p(x)\,dx\right\}\frac{\partial\phi}{\partial y}.$$

10.6 Derive the integrating factor for $y' + P(x)y = Q(x)$ by group theoretical
methods.

10.7 Find a similarity reduction of the porous medium equation, $u_t = (uu_x)_x$.
Can you find any solutions of the resulting differential equation?

10.8 The equation $u_t = u_{xx} - u^p$, with $p \in \mathbb{R}^+$, arises in mathematical models
of tumour growth, and other areas. Find some invariants of this equation,
and the corresponding ordinary differential equations. Can the point source
problem be resolved for this sink-like equation?



CHAPTER ELEVEN 


Asymptotic Methods: Basic Ideas 


The vast majority of differential equations that arise as models for real physical 
systems cannot be solved directly by analytical methods. Often, the only way 
to proceed is to use a computer to calculate an approximate, numerical solution. 
However, if one or more small, dimensionless parameters appear in the differen- 
tial equation, it may be possible to use an asymptotic method to obtain an 
approximate solution. Moreover, the presence of a small parameter often leads to 
a singular perturbation problem, which can be difficult, if not impossible, to 
solve numerically. 

Small, dimensionless parameters usually arise when one physical process occurs 
much more slowly than another, or when one geometrical length in the problem 
is much shorter than another. Examples occur in many different areas of applied 
mathematics, and we will meet several in Chapter 12. As we shall see, dimensionless 
parameters arise naturally when we use dimensionless variables, which we discussed 
at the beginning of Chapter 5. Some other examples are: 

- Waves on the surface of a body of fluid or an elastic solid, with amplitude $a$ and
wavelength $\lambda$, are said to be of small amplitude if $\epsilon = a/\lambda \ll 1$. A simplification
of the governing equations based on the fact that $\epsilon \ll 1$ leads to a system of
linear partial differential equations (see, for example, Billingham and King, 2001).
This is an example of a regular perturbation problem, where the problem is
simplified throughout the domain of solution.

- In the high speed flow of a viscous fluid past a flat plate of length $L$, pressure
changes due to accelerations are much greater than those due to viscous stresses,
as expressed by $\mathrm{Re} = \rho UL/\mu \gg 1$. Here, Re is the Reynolds number, a
dimensionless parameter that measures the ratio of acceleration to viscous forces, $\rho$ is
the fluid density, $U$ the fluid velocity away from the plate and $\mu$ the fluid
viscosity. The solution for $\mathrm{Re}^{-1} \ll 1$ has the fluid velocity equal to $U$ everywhere
except for a small neighbourhood of the plate, known as a boundary layer,
where viscosity becomes important (see, for example, Acheson, 1990). This is
an example of a singular perturbation problem, where an apparently negligible
physical effect, here viscosity, becomes important in a small region.

- In aerodynamics, it is crucial to be able to calculate the lift force on the
cross-section of a wing due to the flow of an inviscid fluid around it. These
cross-sections are usually long and thin with aspect ratio $\epsilon = l/L \ll 1$, where $l$ is a
typical vertical width, and $L$ a typical horizontal length. Thin aerofoil theory
exploits the size of $\epsilon$ to greatly simplify the calculation of the lift force (see, for
example, Milne-Thompson, 1960, and Van Dyke, 1964).


11.1 Asymptotic Expansions 

In this chapter, we will study how to approximate various integrals as series ex- 
pansions, whilst in Chapter 12, we will look at series solutions of differential equa- 
tions. These series are asymptotic expansions. The crucial difference between 
asymptotic expansions and the power series that we met earlier in the book is that 
asymptotic expansions need not be convergent in the usual sense. We can illustrate 
this using an example, but firstly, there is some useful notation for comparing the 
sizes of functions that we will introduce here and use extensively later. 


11.1.1 Gauge Functions 

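(i) If

$$\lim_{\epsilon\to0}\frac{f(\epsilon)}{g(\epsilon)} = A,$$

for some nonzero constant $A$, we write $f(\epsilon) = O(g(\epsilon))$ for $\epsilon \ll 1$. We say
that $f$ is of order $g$ for small $\epsilon$. Here $g(\epsilon)$ is a gauge function, since it is
used to gauge the size of $f(\epsilon)$. For example, when $\epsilon \ll 1$,

$$\sin\epsilon = O(\epsilon), \quad \cos\epsilon = O(1), \quad e^{\epsilon} = O(1), \quad \cos\epsilon - 1 = O(\epsilon^2),$$

all of which can be found from the Taylor series expansions of these functions.
This notation tells us nothing about the constant $A$. For example, $10^{10} = O(1)$.
The order notation only tells us how functions behave as $\epsilon \to 0$. It is
not meant to be used for comparing constants, which are all, by definition,
of $O(1)$.

(ii) We also have a notation available that displays more information about the
behaviour of the functions. If

$$\lim_{\epsilon\to0}\frac{f(\epsilon)}{g(\epsilon)} = 1,$$

we write $f(\epsilon) \sim g(\epsilon)$, and say that $f(\epsilon)$ is asymptotic to $g(\epsilon)$ as $\epsilon \to 0$. For
example,

$$\sin\epsilon \sim \epsilon, \quad 1 - \cos\epsilon \sim \frac{1}{2}\epsilon^2, \quad e^{\epsilon} - 1 \sim \epsilon, \quad \text{as } \epsilon \to 0.$$

(iii) If

$$\lim_{\epsilon\to0}\frac{f(\epsilon)}{g(\epsilon)} = 0,$$

we write $f(\epsilon) = o(g(\epsilon))$, and say that $f$ is much less than $g$. For example

$$\sin\epsilon = o(1), \quad \cos\epsilon = o(\epsilon^{-1}), \quad e^{\epsilon} = o(\log\epsilon) \quad \text{for } \epsilon \ll 1.$$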
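
The three definitions above can be explored numerically by tabulating the ratio of a
function to its gauge function for decreasing values of the small parameter. This is
only an illustration of the limits, not a proof; the helper `ratio` is a hypothetical
name.

```python
import math

def ratio(f, g, eps):
    """The quotient f(eps)/g(eps), whose limit as eps -> 0 decides O, ~ or o."""
    return f(eps) / g(eps)

for eps in (1e-1, 1e-2, 1e-3):
    print(
        eps,
        ratio(math.sin, lambda e: e, eps),                          # tends to 1
        ratio(lambda e: 1 - math.cos(e), lambda e: 0.5 * e * e, eps),  # tends to 1
        ratio(math.sin, lambda e: 1.0, eps),                        # tends to 0
        ratio(math.exp, math.log, eps),                             # tends to 0, slowly
    )
```

The last column decays only logarithmically, which is a reminder that a little-o
relation says nothing about how quickly the ratio tends to zero.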
11.1.2 Example: Series Expansions of the Exponential Integral,
Ei(x)

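Let's consider the exponential integral

$$\mathrm{Ei}(x) = \int_x^{\infty}\frac{e^{-t}}{t}\,dt, \qquad (11.1)$$

for $x > 0$. We can integrate by parts to obtain

$$\mathrm{Ei}(x) = \left[-\frac{e^{-t}}{t}\right]_x^{\infty} - \int_x^{\infty}\frac{e^{-t}}{t^2}\,dt = \frac{e^{-x}}{x} - \int_x^{\infty}\frac{e^{-t}}{t^2}\,dt.$$

Doing this repeatedly leads to

$$\mathrm{Ei}(x) = e^{-x}\sum_{m=1}^{N}(-1)^{m-1}\frac{(m-1)!}{x^m} + R_N, \qquad (11.2)$$

where

$$R_N = (-1)^NN!\int_x^{\infty}\frac{e^{-t}}{t^{N+1}}\,dt. \qquad (11.3)$$

This result is exact. Now, let's consider how big $R_N$ is for $x \gg 1$. Using the fact
that $x^{-(N+1)} > t^{-(N+1)}$ for $t > x$,

$$|R_N| = N!\int_x^{\infty}\frac{e^{-t}}{t^{N+1}}\,dt \leq \frac{N!}{x^{N+1}}\int_x^{\infty}e^{-t}\,dt = \frac{N!e^{-x}}{x^{N+1}},$$

and hence

$$|R_N| = O\left(e^{-x}x^{-(N+1)}\right) \quad \text{for } x \gg 1.$$
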


Therefore, if we truncate the series expansion (11.2) at a fixed value of $N$ by
neglecting $R_N$, it converges to $\mathrm{Ei}(x)$ as $x \to \infty$. This is our first example of an
asymptotic expansion. In common with most useful asymptotic expansions,
(11.2) does not converge in the usual sense. The ratio of the $(N+1)$th and $N$th
terms is

$$\frac{(-1)^NN!}{x^{N+1}}\,\frac{x^N}{(-1)^{N-1}(N-1)!} = -\frac{N}{x},$$

which is unbounded as $N \to \infty$, so the series diverges, by the ratio test. However,
for $x$ even moderately large, (11.2) provides an extremely efficient method of
calculating $\mathrm{Ei}(x)$, as we shall see.
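
The divergence of (11.2) is easy to see numerically: for fixed x the terms shrink while
the index is less than x and grow thereafter. The sketch below locates the smallest
term; the function names and the cut-off `m_max` are illustrative assumptions.

```python
import math

def term(m, x):
    """The m-th term of (11.2): e^(-x) (-1)^(m-1) (m-1)! / x^m."""
    return math.exp(-x) * (-1) ** (m - 1) * math.factorial(m - 1) / x ** m

def smallest_term_index(x, m_max=60):
    """Index of the term of smallest magnitude among the first m_max terms."""
    return min(range(1, m_max + 1), key=lambda m: abs(term(m, x)))

x = 10.5
m_star = smallest_term_index(x)
print(m_star, abs(term(m_star, x)), abs(term(40, x)))
```

Since the magnitudes of consecutive terms are in the ratio m/x, the smallest term
occurs at about m = x, and the terms beyond that grow without bound.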

In order to develop the power series representation of $\mathrm{Ei}(x)$ about $x = 0$, we
would like to use the series expansion of $e^{-t}$. However, term by term integration is
not possible as things stand, since the integrals will not be convergent. Moreover,
the integral for $\mathrm{Ei}(x)$ does not converge as $x \to 0$. We therefore have to be a little
more cunning in order to obtain our power series. Firstly, note that

$$\mathrm{Ei}(x) = \int_x^{\infty}\left(\frac{e^{-t}}{t} - \frac{1}{t(1+t)}\right)dt + \int_x^{\infty}\frac{dt}{t(1+t)} = \int_x^{\infty}\frac{1}{t}\left(e^{-t} - \frac{1}{1+t}\right)dt - \log\left(\frac{x}{1+x}\right).$$

With this rearrangement, we can take the limit $x \to 0$ in the first integral, with
the second integral giving $\mathrm{Ei}(x) \sim -\log x$ as $x \to 0$. A further rearrangement then
gives

$$\mathrm{Ei}(x) = \int_0^{\infty}\frac{1}{t}\left(e^{-t} - \frac{1}{1+t}\right)dt - \int_0^x\frac{1}{t}\left(e^{-t} - \frac{1}{1+t}\right)dt - \log\left(\frac{x}{1+x}\right)$$
$$= \int_0^{\infty}\frac{1}{t}\left(e^{-t} - \frac{1}{1+t}\right)dt - \log x - \int_0^x\frac{e^{-t} - 1}{t}\,dt.$$

The first term is just a constant, which we could evaluate numerically, but can be
shown to be equal to minus Euler's constant, $\gamma = 0.5772\ldots$, although we will not
prove this here. We can now use the power series representation of $e^{-t}$ to give

$$\mathrm{Ei}(x) = -\gamma - \log x - \int_0^x\frac{1}{t}\sum_{n=1}^{\infty}\frac{(-t)^n}{n!}\,dt.$$

Since this power series converges uniformly for all $t$, we can interchange the order
of summation and integration, and finally arrive at

$$\mathrm{Ei}(x) = -\gamma - \log x - \sum_{n=1}^{\infty}\frac{(-x)^n}{n\,n!}. \qquad (11.4)$$

This representation of $\mathrm{Ei}(x)$ is convergent for all $x > 0$.

The philosophy that we have used earlier in the book is that we can now use (11.4)
to calculate the function for any given $x$. There are, however, practical problems
associated with this. Firstly, the series converges very slowly. For example, if we
take $x = 5$ we need 20 terms of the series in order to get three-figure accuracy.
Secondly, even if we take enough terms in the series to get an accurate answer,
the result is the difference of many large terms. For example, the largest term
in the series for $\mathrm{Ei}(20)$ is approximately $2.3 \times 10^6$, whilst for $\mathrm{Ei}(40)$ this rises to
approximately $3.8 \times 10^{14}$. Unless the computer used to sum the series stores many
significant figures, there will be a large roundoff error involved in calculating
$\mathrm{Ei}(x)$ as the small difference of many large terms. These difficulties do not arise for
the asymptotic expansion, (11.2). Consider the two-term expansion

$$\mathrm{Ei}(x) \sim e^{-x}\left(\frac{1}{x} - \frac{1}{x^2}\right).$$

Figure 11.1 shows the relative error using this expansion compared with that using
the first 25 terms of the convergent series representation, (11.4). The advantages
of the asymptotic series over the convergent series are immediately apparent. Note
that the convergent series (11.4) also provides an asymptotic expansion valid for
$x \ll 1$. We conclude that asymptotic expansions can also be convergent series.
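
The comparison described above can be reproduced in a few lines. The sketch below
evaluates the convergent series (11.4) and the asymptotic expansion (11.2) at x = 10,
using a long partial sum of (11.4) as the reference value; the function names and the
choice of 100 terms for the reference are assumptions made for illustration. Note that
Ei(x) here denotes the integral (11.1), which many references write as E1(x).

```python
import math

EULER_GAMMA = 0.5772156649015329   # Euler's constant

def ei_series(x, terms):
    """Partial sum of the convergent series (11.4)."""
    total = -EULER_GAMMA - math.log(x)
    for n in range(1, terms + 1):
        total -= (-x) ** n / (n * math.factorial(n))
    return total

def ei_asymptotic(x, terms):
    """Partial sum of the asymptotic expansion (11.2), neglecting the remainder."""
    return math.exp(-x) * sum(
        (-1) ** (m - 1) * math.factorial(m - 1) / x ** m for m in range(1, terms + 1)
    )

x = 10.0
reference = ei_series(x, 100)   # enough terms for the series to have converged
err_two_term = abs(ei_asymptotic(x, 2) - reference) / reference
err_25_terms = abs(ei_series(x, 25) - reference) / reference
print(reference, err_two_term, err_25_terms)
```

At this value of x the two-term asymptotic expansion is accurate to a few per cent,
whereas 25 terms of the convergent series have not yet begun to approximate the
function at all.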
Fig. 11.1. The relative error in calculating Ei(x) using the first 25 terms of the convergent
series expansion (11.4), solid line, and the two-term asymptotic expansion (11.2), dashed
line.


To summarize, let

$$S_N(x) = \sum_{n=0}^{N}f_n(x).$$

- If $S_N$ is a convergent series representation of $S(x)$, $S_N \to S(x)$ as $N \to \infty$ for
fixed $x$ within the radius of convergence.

- If $S_N$ is an asymptotic series representation of $S(x)$, $S_N(x) \sim S(x)$ as $x \to \infty$
(or whatever limit is appropriate) for any fixed $N \geq 0$.


11.1.3 Asymptotic Sequences of Gauge Functions 

Let $\delta_n(\epsilon)$, $n = 0, 1, 2, \ldots$ be a sequence of functions such that $\delta_n = o(\delta_{n-1})$ for
$\epsilon \ll 1$. Such a sequence is called an asymptotic sequence of gauge functions.
For example, $\delta_n = \epsilon^n$, $\delta_n = \epsilon^{n/2}$, $\delta_n = (\cot\epsilon)^{-n}$, $\delta_n = (-\log\epsilon)^{-n}$. If we can write

$$f(\epsilon) \sim \sum_{n=0}^{N}a_n\delta_n(\epsilon) \quad \text{as } \epsilon \to 0,$$

for some sequence of constants $a_n$, we say that $f(\epsilon)$ has an asymptotic
expansion relative to the asymptotic sequence $\delta_n(\epsilon)$ for $\epsilon \ll 1$. For example, $\mathrm{Ei}(x)$ has
an asymptotic expansion, (11.2), relative to the asymptotic sequence $x^{-n}e^{-x}$ for
$x \gg 1$. The asymptotic expansion of a function relative to a given sequence of
gauge functions is unique, since each $a_n$ can be calculated in turn from

$$a_0 = \lim_{\epsilon\to0}\frac{f(\epsilon)}{\delta_0(\epsilon)}, \quad a_n = \lim_{\epsilon\to0}\left\{f(\epsilon) - \sum_{m=0}^{n-1}a_m\delta_m(\epsilon)\right\}\Big/\delta_n(\epsilon),$$

for $n = 1, 2, \ldots$. However, a given function can have different asymptotic expansions
with respect to different sequences of gauge functions. For example,


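$$\text{if } \delta_n(\epsilon) = \epsilon^n, \quad \sin\epsilon \sim \sum_{n=1}^{\infty}(-1)^{n+1}\frac{\epsilon^{2n-1}}{(2n-1)!},$$

$$\text{if } \delta_n(\epsilon) = (1 - e^{\epsilon})^n, \quad \sin\epsilon \sim -\delta_1 - \frac{1}{2}\delta_2 - \frac{1}{6}\delta_3 + \cdots.$$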

Some sequences of gauge functions are clearly easier to use than others. We also
note that different functions may have the same asymptotic expansion when we
consider only a finite number of terms. For example,

$$e^{\epsilon} \sim \sum_{n=0}^{N}\frac{\epsilon^n}{n!} \quad \text{as } \epsilon \to 0^+, \text{ for any } N > 0,$$

$$e^{\epsilon} + e^{-1/\epsilon} \sim \sum_{n=0}^{N}\frac{\epsilon^n}{n!} \quad \text{as } \epsilon \to 0^+, \text{ for any } N > 0.$$

Since $e^{-1/\epsilon}$ is exponentially small, it will only appear in the asymptotic expansion
after all of the algebraic terms.

Now let’s consider functions of a single variable x and a parameter e, f(x;e). 
We can think of this as a typical solution of an ordinary differential equation with 
independent variable x and a small parameter, e. If / has an asymptotic expansion, 

N 

f(x',e) ~ ^2 fn(x)6 n (e) as e -> 0, 

n = 0 


that is valid for all x in some domain R , we say that the expansion is uniformly 
valid in R. If this is not the case, we say that the expansion becomes nonuniform 
in some subdomain. For example, 

$$\sin(x+\epsilon) = \sin x + \epsilon\cos x - \frac{1}{2!}\epsilon^2\sin x - \frac{1}{3!}\epsilon^3\cos x + O(\epsilon^4) \quad \text{as } \epsilon \to 0,$$

for all x, so the expansion is uniformly valid. Now consider 

$$\sqrt{x+\epsilon} = \sqrt{x}\left(1 + \frac{\epsilon}{x}\right)^{1/2} \sim \sqrt{x}\left\{1 + \frac{\epsilon}{2x} - \frac{\epsilon^2}{8x^2} + \cdots\right\} \quad \text{as } \epsilon \to 0,$$

provided $x \gg \epsilon$. We say that the expansion becomes nonuniform as $x \to 0$, when
$x = O(\epsilon)$. Note that each successive term is smaller by a factor of $\epsilon/x$, and therefore
the expansion fails to be asymptotic when $x = O(\epsilon)$. To determine an asymptotic
expansion valid when $x = O(\epsilon)$, we define a new, scaled variable, a procedure
that we will use again and again later, given by $x = \epsilon X$, with $X = O(1)$ as $\epsilon \to 0$.
In terms of $X$,

$$\sqrt{x+\epsilon} = \epsilon^{1/2}\sqrt{X+1}.$$

This is the trivial asymptotic expansion valid for $X = O(1)$†.
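To see the nonuniformity concretely, one can compare the two-term expansion $\sqrt{x}\,(1 + \epsilon/2x)$ of $\sqrt{x+\epsilon}$ with the exact value, first at $x = O(1)$ and then at $x = \epsilon$. This is an illustrative sketch only; the helper names `two_term` and `rel_error` are ours.

```python
import math


def two_term(x, eps):
    """Two-term expansion sqrt(x) * (1 + eps / (2 x)) of sqrt(x + eps)."""
    return math.sqrt(x) * (1.0 + eps / (2.0 * x))


def rel_error(x, eps):
    """Relative error of the two-term expansion at the point x."""
    exact = math.sqrt(x + eps)
    return abs(two_term(x, eps) - exact) / exact


if __name__ == "__main__":
    for eps in (1e-2, 1e-4, 1e-6):
        # x = 1 lies in the region x >> eps; x = eps lies in the nonuniform region.
        print(eps, rel_error(1.0, eps), rel_error(eps, eps))
```

At $x = 1$ the error falls like $\epsilon^2$, whereas at $x = \epsilon$ it stays fixed at about 6% however small $\epsilon$ is, which is exactly the failure that the rescaling $x = \epsilon X$ removes.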


11.2 The Asymptotic Evaluation of Integrals 

We have already seen in Chapters 5 and 6 that the solution of a differential equation 
can often be found in closed form in terms of an integral. Instead of trying to 
develop the asymptotic solution of a differential equation as some parameter or 
variable becomes small, it is often more convenient to find an integral expression for 
the solution and then seek an asymptotic expansion of the integral. Let’s consider 
integrals of the type 

$$I(\lambda) = \int_C e^{\lambda f(t)} g(t)\,dt, \qquad (11.5)$$

with $f$ and $g$ analytic functions of $t$, $\lambda$ real and positive and $C$ a contour that
joins two distinct points in the complex $t$-plane, one or both of which may be at
infinity. Such integrals arise very commonly in this context. For example, the
Fourier transform, which we studied in Chapter 5, takes this form with $f(t) = it$,
whilst the Laplace inversion integral, which we discussed in Section 6.4, has $f(t) = t$.

Another common example is the solution of Airy’s equation, $y'' - xy = 0$, which
we wrote in terms of Bessel functions in Section 3.8. The solution $y = \mathrm{Ai}(x)$ can
also be written in integral form as

$$\mathrm{Ai}(x) = \frac{1}{2\pi i}\int_C e^{xt - \frac{1}{3}t^3}\,dt. \qquad (11.6)$$

To see that this is a solution of Airy’s equation, note that

$$\mathrm{Ai}'' - x\,\mathrm{Ai} = \frac{1}{2\pi i}\int_C (t^2 - x)\,e^{xt - \frac{1}{3}t^3}\,dt = -\frac{1}{2\pi i}\int_C \frac{d}{dt}\left(e^{xt - \frac{1}{3}t^3}\right)dt = 0,$$

provided that $e^{xt - \frac{1}{3}t^3} \to 0$ as $|t| \to \infty$ on the contour $C$. For $|t| \gg 1$,
$xt - \frac{1}{3}t^3 \sim -\frac{1}{3}t^3$, so we need $\mathrm{Re}(t^3) > 0$ on the contour $C$, and therefore
$-\frac{\pi}{6} < \arg(t) < \frac{\pi}{6}$, $\frac{\pi}{2} < \arg(t) < \frac{5\pi}{6}$ or $-\frac{\pi}{2} > \arg(t) > -\frac{5\pi}{6}$.
As we shall see later, $\mathrm{Ai}(x)$ is distinguished from $\mathrm{Bi}(x)$ in that $\mathrm{Ai}(x) \to 0$ as
$x \to \infty$, in particular, with $\mathrm{Ai}(x) \sim x^{-1/4}\exp\left(-\frac{2}{3}x^{3/2}\right)/2\sqrt{\pi}$. This means that
an appropriate contour $C$ originates with $\arg(t) = -\frac{2\pi}{3}$ and terminates with
$\arg(t) = \frac{2\pi}{3}$ (for example, the contour $C$ in Figure 11.10).

Rather than diving straight in and trying to evaluate $I(\lambda)$ in (11.5), we can
proceed by considering two simpler cases first.


† and of course valid for all $X$ in this simple example.

11.2.1 Laplace’s Method

Consider the complex integral along the real line

$$I_1(\lambda) = \int_a^b e^{\lambda f(t)} g(t)\,dt \quad \text{with } f(t) \text{ real and } \lambda \gg 1. \qquad (11.7)$$

In this case, the integrand is largest at the point where $f(t)$ is largest. In fact,
we can approximate the integral by simply considering the contribution from the
neighbourhood of the point where $f(t)$ takes its maximum value. This is known as

Laplace’s Method

Integrals of the form (11.7) can be estimated by
considering the contribution from the neighbourhood
of the maximum value of $f(t)$ alone.

We will see how we can justify this rigorously later. For the moment, let’s consider 
some examples. 


Example 1 

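Consider the integral representation

$$K_\nu(x) = \int_0^\infty e^{-x\cosh t}\cosh\nu t\,dt,$$

of the modified Bessel function, $K_\nu(x)$. When $x \gg 1$, this is in the form (11.7)
with $f(t) = -\cosh t$ and $g(t) = \cosh\nu t$. The maximum value of $f(t) = -\cosh t$ is
at $t = 0$, where $f(t) = -1$ and $g(t) = 1$. For $t \ll 1$, $\cosh t \sim 1 + \frac{1}{2}t^2$, and we can
therefore use Laplace’s method to approximate $K_\nu(x)$ as

$$K_\nu(x) \sim \int_0^\infty e^{-x\left(1 + \frac{1}{2}t^2\right)}\,dt = e^{-x}\int_0^\infty e^{-\frac{1}{2}xt^2}\,dt.$$

After making the substitution $t = \hat{t}\sqrt{2/x}$, this becomes

$$K_\nu(x) \sim \sqrt{\frac{2}{x}}\,e^{-x}\int_0^\infty e^{-\hat{t}^2}\,d\hat{t} = \sqrt{\frac{\pi}{2x}}\,e^{-x}, \quad \text{for } x \gg 1,$$

using (3.5).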
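As a sanity check on Example 1, the integral representation can be evaluated by simple quadrature and compared with the leading-order estimate $\sqrt{\pi/2x}\,e^{-x}$. The sketch below is ours: the function names, the truncation point `t_max` and the number of steps are illustrative assumptions, and the trapezium rule is used only because the integrand is smooth and decays extremely fast.

```python
import math


def bessel_k(nu, x, t_max=20.0, steps=20000):
    """Approximate K_nu(x) = int_0^inf exp(-x cosh t) cosh(nu t) dt.

    The infinite range is truncated at t_max (an assumption that is
    harmless here because the integrand decays doubly exponentially).
    """
    h = t_max / steps
    total = 0.0
    for j in range(steps + 1):
        t = j * h
        weight = 0.5 if j in (0, steps) else 1.0
        total += weight * math.exp(-x * math.cosh(t)) * math.cosh(nu * t)
    return h * total


def laplace_estimate(x):
    """Leading-order Laplace approximation sqrt(pi / (2 x)) * exp(-x)."""
    return math.sqrt(math.pi / (2.0 * x)) * math.exp(-x)


if __name__ == "__main__":
    for x in (2.0, 5.0, 10.0, 40.0):
        print(x, bessel_k(0.0, x) / laplace_estimate(x))
```

The printed ratio approaches 1 as $x$ grows, and the leading-order estimate does not depend on $\nu$, as the analysis predicts.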


Example 2 

Consider the definition,

$$\Gamma(1+\lambda) = \int_0^\infty t^\lambda e^{-t}\,dt = \int_0^\infty e^{\lambda\log t - t}\,dt,$$

of the gamma function, which we met in Section 3.1. Can we find the asymptotic
form of the gamma function when $\lambda \gg 1$? Since the definition is not quite in the
form that we require for Laplace’s method, we need to make a transformation. If
we let $t = \lambda\tau$, we find that

$$\Gamma(1+\lambda) = \lambda^{1+\lambda}\int_0^\infty e^{\lambda(\log\tau - \tau)}\,d\tau.$$

This is in the form of (11.7) with $f(\tau) = \log\tau - \tau$. Since $f'(\tau) = 1/\tau - 1$ and
$f''(\tau) = -1/\tau^2 < 0$, $f$ has a local maximum at $\tau = 1$. Laplace’s method states that
$\Gamma(1+\lambda)$ is dominated by the contribution to the integral from the neighbourhood
of $\tau = 1$. We use a Taylor series expansion,

$$\log\tau - \tau = \log\{1 + (\tau - 1)\} - 1 - (\tau - 1) = -1 - \frac{1}{2}(\tau - 1)^2 + \cdots \quad \text{for } |\tau - 1| \ll 1,$$

write $T = \tau - 1$ and extend the range of integration, to obtain

$$\Gamma(1+\lambda) \sim \lambda^{1+\lambda}e^{-\lambda}\int_{-\infty}^{\infty} e^{-\frac{1}{2}\lambda T^2}\,dT \quad \text{as } \lambda \to \infty.$$

If we now let $T = \hat{T}/\lambda^{1/2}$, we find that

$$\Gamma(1+\lambda) \sim \lambda^{\lambda+\frac{1}{2}}e^{-\lambda}\int_{-\infty}^{\infty} e^{-\frac{1}{2}\hat{T}^2}\,d\hat{T},$$

and hence

$$\lambda! = \Gamma(1+\lambda) \sim \sqrt{2\pi}\,\lambda^{\lambda+\frac{1}{2}}e^{-\lambda} \quad \text{as } \lambda \to \infty. \qquad (11.8)$$

This is known as Stirling’s formula, and provides an excellent approximation to
the gamma function, for $\lambda > 2$, as shown in Figure 11.2.
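The quality of Stirling’s formula is easy to check against a library gamma function. This is a small illustrative sketch using Python’s standard `math.gamma`; the function names are ours.

```python
import math


def stirling(lam):
    """Stirling's formula (11.8): sqrt(2 pi) * lam**(lam + 1/2) * exp(-lam)."""
    return math.sqrt(2.0 * math.pi) * lam ** (lam + 0.5) * math.exp(-lam)


def stirling_ratio(lam):
    """Ratio Gamma(1 + lam) / Stirling approximation; tends to 1 as lam grows."""
    return math.gamma(1.0 + lam) / stirling(lam)


if __name__ == "__main__":
    for lam in (1.0, 2.0, 5.0, 10.0, 50.0):
        print(lam, stirling_ratio(lam))
```

The ratio exceeds 1 by roughly $1/12\lambda$, so the approximation is already within about 4% at $\lambda = 2$.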


Example 3 


Consider the integral

$$I(\lambda) = \int_0^{10} \frac{e^{-\lambda t}}{1+t}\,dt.$$

In this case, $f = -t$ and $g = 1/(1+t)$. Since $f'(t) = -1 \neq 0$ and the maximum
value of $f$ occurs at $t = 0$ for $t \in [0, 10]$,

$$I(\lambda) \sim \int_0^{10} e^{-\lambda t}\,dt = \frac{1}{\lambda}\left(1 - e^{-10\lambda}\right) \sim \frac{1}{\lambda} \quad \text{as } \lambda \to \infty.$$

In fact we can use the binomial expansion

$$(1+t)^{-1} = 1 - t + t^2 - t^3 + \cdots + (-1)^n t^n + \cdots,$$

even though this is only convergent for $|t| < 1$, since the integrand is exponentially
small away from $t = 0$. We can also extend the range of integration to infinity
without affecting the result, to give

$$I(\lambda) \sim \sum_{n=0}^{\infty}\int_0^\infty (-1)^n t^n e^{-\lambda t}\,dt = \sum_{n=0}^{\infty} (-1)^n\frac{\Gamma(n+1)}{\lambda^{n+1}} = \sum_{n=0}^{\infty} (-1)^n\frac{n!}{\lambda^{n+1}} \quad \text{as } \lambda \to \infty,$$

since, using the substitution $\tau = \lambda t$,

$$\int_0^\infty t^n e^{-\lambda t}\,dt = \frac{1}{\lambda^{n+1}}\int_0^\infty \tau^n e^{-\tau}\,d\tau = \frac{1}{\lambda^{n+1}}\Gamma(n+1) = \frac{n!}{\lambda^{n+1}}.$$

Note that, as we found for Ei(x), this is an asymptotic, rather than convergent,
series representation of $I(\lambda)$. We can justify this procedure using the following
lemma.

Fig. 11.2. The gamma function, $\Gamma(1+\lambda) = \lambda!$, and its asymptotic approximation using
Stirling’s formula, (11.8).


Lemma 11.1 (Watson’s lemma) If

$$I(\lambda) = \int_0^A e^{-\lambda t} g(t)\,dt,$$

for $A > 0$, with $g$ either bounded or having only integrable singularities,

$$g(t) \sim \sum_{n=0}^{N} a_n t^{\alpha_n} \quad \text{as } t \to 0^+,$$

and $-1 < \alpha_0 < \alpha_1 < \cdots < \alpha_N$ (which ensures that $e^{-\lambda t}g(t)$ has at worst an
integrable singularity at $t = 0$), then

$$I(\lambda) \sim \int_0^\infty e^{-\lambda t}\sum_{n=0}^{N} a_n t^{\alpha_n}\,dt = \sum_{n=0}^{N} a_n\lambda^{-\alpha_n-1}\Gamma(\alpha_n+1) \quad \text{as } \lambda \to \infty.$$


Note that this lemma simply says that the integral is dominated by the contribution
in the neighbourhood of $t = 0$, so that we can replace $g$ with its asymptotic
expansion and extend the range of integration to infinity.

Proof We begin by separating off the contribution to the integral from the
neighbourhood of the origin by noting that


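$$\left|I(\lambda) - \sum_{n=0}^{N} a_n\lambda^{-\alpha_n-1}\Gamma(\alpha_n+1)\right| = \left|I(\lambda) - \sum_{n=0}^{N} a_n\int_0^\infty t^{\alpha_n}e^{-\lambda t}\,dt\right|$$

$$= \left|\int_0^\delta e^{-\lambda t}g(t)\,dt - \int_0^\infty e^{-\lambda t}\sum_{n=0}^{N} a_n t^{\alpha_n}\,dt + \int_\delta^A e^{-\lambda t}g(t)\,dt\right|$$

for any real $\delta$ with $0 < \delta < A$. Next, we make use of the asymptotic behaviour of
$g$. Since

$$\left|g(t) - \sum_{n=0}^{N} a_n t^{\alpha_n}\right| < K t^{\alpha_{N+1}}$$

for some $K > 0$ when $0 < t < \delta$ and $\delta$ is sufficiently small, we have

$$\left|I(\lambda) - \sum_{n=0}^{N} a_n\lambda^{-\alpha_n-1}\Gamma(\alpha_n+1)\right|$$

$$< \left|\int_0^\delta e^{-\lambda t}\sum_{n=0}^{N} a_n t^{\alpha_n}\,dt + K\int_0^\delta e^{-\lambda t}t^{\alpha_{N+1}}\,dt - \int_0^\infty e^{-\lambda t}\sum_{n=0}^{N} a_n t^{\alpha_n}\,dt\right| + \left|\int_\delta^A e^{-\lambda t}g(t)\,dt\right|$$

$$= \left|-\sum_{n=0}^{N} a_n\int_\delta^\infty t^{\alpha_n}e^{-\lambda t}\,dt + K\int_0^\infty e^{-\lambda t}t^{\alpha_{N+1}}\,dt - K\int_\delta^\infty e^{-\lambda t}t^{\alpha_{N+1}}\,dt\right| + \left|\int_\delta^A e^{-\lambda t}g(t)\,dt\right|$$

$$\leq \sum_{n=0}^{N} |a_n|\int_\delta^\infty t^{\alpha_n}e^{-\lambda t}\,dt + K\lambda^{-\alpha_{N+1}-1}\Gamma(\alpha_{N+1}+1) + K\int_\delta^\infty e^{-\lambda t}t^{\alpha_{N+1}}\,dt + \int_\delta^A e^{-\lambda t}|g(t)|\,dt.$$

Now, since $e^{-\lambda t} < e^{-(\lambda-1)\delta}e^{-t}$ for $t > \delta$,

$$\int_\delta^\infty t^{\alpha_n}e^{-\lambda t}\,dt < e^{-(\lambda-1)\delta}\int_\delta^\infty t^{\alpha_n}e^{-t}\,dt < e^{-(\lambda-1)\delta}\,\Gamma(\alpha_n+1),$$

and also

$$\int_\delta^A e^{-\lambda t}|g(t)|\,dt < Ge^{-\lambda\delta},$$

for $t \in [\delta, A]$, where $G = \int_\delta^A |g(t)|\,dt$. This finally shows that

$$\left|I(\lambda) - \sum_{n=0}^{N} a_n\lambda^{-\alpha_n-1}\Gamma(\alpha_n+1)\right| < K\lambda^{-\alpha_{N+1}-1}\Gamma(\alpha_{N+1}+1) + Ge^{-\lambda\delta}$$

$$+\, e^{-(\lambda-1)\delta}\left\{\sum_{n=0}^{N} |a_n|\int_0^\infty t^{\alpha_n}e^{-t}\,dt + K\int_0^\infty t^{\alpha_{N+1}}e^{-t}\,dt\right\} = O\left(\lambda^{-\alpha_{N+1}-1}\right),$$

and hence the result is proved.

□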


A simple modification of this proof can be used to justify the use of Laplace’s 
method in general. 
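The series obtained in Example 3, which Watson’s lemma justifies, can be compared with a direct numerical evaluation of the integral. The sketch below is ours and only illustrative: the function names are hypothetical, and Simpson’s rule with a fixed number of steps is simply one convenient way to get a reference value.

```python
import math


def integral(lam, steps=20000):
    """Simpson's rule for I(lam) = int_0^10 exp(-lam t) / (1 + t) dt."""
    a, b = 0.0, 10.0
    h = (b - a) / steps  # steps must be even for Simpson's rule

    def integrand(t):
        return math.exp(-lam * t) / (1.0 + t)

    total = integrand(a) + integrand(b)
    for j in range(1, steps):
        total += (4 if j % 2 else 2) * integrand(a + j * h)
    return total * h / 3.0


def partial_sum(lam, n_terms):
    """First n_terms terms of the asymptotic series sum (-1)^n n! / lam^(n+1)."""
    return sum((-1) ** n * math.factorial(n) / lam ** (n + 1)
               for n in range(n_terms))


if __name__ == "__main__":
    lam = 10.0
    exact = integral(lam)
    for n_terms in (1, 2, 5, 10, 20, 30):
        print(n_terms, abs(partial_sum(lam, n_terms) - exact))
```

For $\lambda = 10$ the error shrinks until about ten terms are retained and then grows without bound, which is the characteristic behaviour of an asymptotic, rather than convergent, series.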


11.2.2 The Method of Stationary Phase 

Let’s now consider an integral, again along the real line, of the form

$$I_2(\lambda) = \int_a^b e^{i\lambda F(t)} g(t)\,dt, \quad \text{with } F(t) \text{ real and } \lambda \gg 1. \qquad (11.9)$$

Integrals of this type arise when Fourier transforms are used to solve differential
equations (see Chapter 5). Points where $F'(t) = 0$ are called points of stationary
phase, and the integral can be evaluated by considering the sum of the contributions
from each of these points. We will consider this more carefully in the next
section. For the moment, we can illustrate why this should be so by considering,
as an example, the case $F(t) = (1-t)^2$, $g(t) = t$, which has $F'(t) = 0$ when $t = 1$,
and

$$g(t)e^{i\lambda F(t)} = t\cos\lambda(1-t)^2 + it\sin\lambda(1-t)^2.$$

The rapid oscillations of this integrand lead to almost complete cancellation, except
in the neighbourhood of the point of stationary phase, as can be seen in Figure 11.3.


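Fig. 11.3. The function $t\cos\lambda(1-t)^2$ for $\lambda = 5$, 10 and 20.


Let’s consider the situation when there is a single point of stationary phase at
$t = t_0$. Then, since $F'(t_0) = 0$,

$$F(t) \sim F(t_0) + \frac{1}{2}(t-t_0)^2F''(t_0) + O\left((t-t_0)^3\right) \quad \text{for } |t-t_0| \ll 1,$$

provided that $F''(t_0) \neq 0$ (see Exercise 11.11). If we assume that the integral is
dominated by the contribution from the neighbourhood of the point of stationary
phase,

$$I_2(\lambda) \sim \int_{t_0-\delta}^{t_0+\delta} g(t_0)\exp\left[i\lambda\left\{F(t_0) + \frac{1}{2}F''(t_0)(t-t_0)^2\right\}\right]dt = e^{i\lambda F(t_0)}g(t_0)\int_{t_0-\delta}^{t_0+\delta}\exp\left\{\frac{1}{2}i\lambda F''(t_0)(t-t_0)^2\right\}dt$$

for some $\delta \ll 1$. If we now let

$$T = (t-t_0)\sqrt{\frac{1}{2}\lambda|F''(t_0)|},$$

this becomes

$$I_2(\lambda) \sim e^{i\lambda F(t_0)}g(t_0)\sqrt{\frac{2}{\lambda|F''(t_0)|}}\int_{-\delta\sqrt{\lambda|F''(t_0)|/2}}^{\delta\sqrt{\lambda|F''(t_0)|/2}} e^{i\,\mathrm{sgn}\{F''(t_0)\}T^2}\,dT$$

and, at leading order as $\lambda \to \infty$,

$$I_2(\lambda) \sim e^{i\lambda F(t_0)}g(t_0)\sqrt{\frac{2}{\lambda|F''(t_0)|}}\int_{-\infty}^{\infty} e^{i\,\mathrm{sgn}\{F''(t_0)\}T^2}\,dT.$$

We now just need to calculate

$$J = \int_0^\infty e^{iT^2}\,dT.$$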


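To do this, consider the contours $C_1$, $C_2$ and $C_3$ in the complex $T$-plane, illustrated
in Figure 11.4. Considering the contour $C_2$ first,

$$\left|\int_{C_2} e^{iT^2}\,dT\right| = \left|\int_0^{\pi/4} e^{iR^2(\cos 2\theta + i\sin 2\theta)}\,iRe^{i\theta}\,d\theta\right| \leq R\int_0^{\pi/4} e^{-R^2\sin 2\theta}\,d\theta \to 0 \quad \text{as } R \to \infty,$$

by Jordan’s lemma.

Fig. 11.4. The contours $C_1$, $C_2$ and $C_3$ in the complex $T$-plane.

On the remaining contours, with $C_1$ the real interval from $0$ to $R$ and $C_3$ the ray
$T = se^{i\pi/4}$ traversed from $s = R$ to $s = 0$,

$$\int_{C_1} e^{iT^2}\,dT = \int_0^R e^{iT^2}\,dT \to J \quad \text{as } R \to \infty,$$

and

$$\int_{C_3} e^{iT^2}\,dT = -e^{i\pi/4}\int_0^R e^{-s^2}\,ds \to -\frac{\sqrt{\pi}}{2}e^{i\pi/4} \quad \text{as } R \to \infty.$$

By Cauchy’s theorem

$$\int_{C_1+C_2+C_3} e^{iT^2}\,dT = 0,$$

and hence

$$J = \int_0^\infty e^{iT^2}\,dT = \frac{\sqrt{\pi}}{2}e^{i\pi/4}.$$

Similarly,

$$\int_0^\infty e^{-iT^2}\,dT = \frac{\sqrt{\pi}}{2}e^{-i\pi/4}.$$

We conclude that

$$I_2(\lambda) \sim e^{i\lambda F(t_0) + i\,\mathrm{sgn}\{F''(t_0)\}\pi/4}\,g(t_0)\sqrt{\frac{2\pi}{\lambda|F''(t_0)|}} = O\left(\lambda^{-1/2}\right) \quad \text{as } \lambda \to \infty.$$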


In our example, $F(t) = (1-t)^2$, so that $t_0 = 1$, $F(t_0) = 0$ and $F''(t_0) = 2$, and
$g(t) = t$, so that $g(t_0) = 1$. We conclude that

$$\int_a^b te^{i\lambda(1-t)^2}\,dt \sim \sqrt{\frac{\pi}{\lambda}}\,e^{i\pi/4} \quad \text{as } \lambda \to \infty, \text{ when } a < 1 < b. \qquad (11.10)$$

If $a = 1$ or $b = 1$, we get half of the contribution from the stationary phase point.
If $a > 1$ or $b < 1$, there are no points of stationary phase. In this case, we can, in
general, integrate by parts to obtain

$$I_2(\lambda) \sim \frac{1}{i\lambda}\left\{\frac{g(b)}{F'(b)}e^{i\lambda F(b)} - \frac{g(a)}{F'(a)}e^{i\lambda F(a)}\right\} \quad \text{as } \lambda \to \infty. \qquad (11.11)$$

Note that the contribution from the endpoints is of $O(\lambda^{-1})$, whilst contributions
from points of stationary phase are larger, of $O(\lambda^{-1/2})$.
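The stationary phase estimate (11.10) can be tested against direct quadrature of the oscillatory integral. The following sketch is ours, with hypothetical function names; the choice $a = 0$, $b = 2$ and the use of Simpson’s rule with a fixed step count are illustrative assumptions.

```python
import cmath
import math


def oscillatory_integral(lam, a=0.0, b=2.0, steps=20000):
    """Simpson's rule for int_a^b t exp(i lam (1 - t)^2) dt."""
    h = (b - a) / steps  # steps must be even for Simpson's rule

    def integrand(t):
        return t * cmath.exp(1j * lam * (1.0 - t) ** 2)

    total = integrand(a) + integrand(b)
    for j in range(1, steps):
        total += (4 if j % 2 else 2) * integrand(a + j * h)
    return total * h / 3.0


def stationary_phase(lam):
    """Leading-order estimate (11.10): sqrt(pi / lam) * exp(i pi / 4)."""
    return math.sqrt(math.pi / lam) * cmath.exp(1j * math.pi / 4.0)


if __name__ == "__main__":
    for lam in (25.0, 100.0, 200.0):
        diff = abs(oscillatory_integral(lam) - stationary_phase(lam))
        print(lam, abs(stationary_phase(lam)), diff)
```

The discrepancy decays like $1/\lambda$, the size of the endpoint contributions, whilst the stationary phase term itself decays only like $\lambda^{-1/2}$.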


Example 

Consider (5.40), the solution of an initial value problem for the wave equation
that we considered in Section 5.5.2. Let’s analyze this solution when $t \gg 1$ at a
point $\mathbf{x} = \mathbf{v}t$ that moves with constant velocity, $\mathbf{v}$. Since (5.40) is written in a
form independent of the orientation of the coordinate axes, we can assume that
$\mathbf{v} = v(0, 0, 1)$. Consider

$$I_+ = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} g(\mathbf{k})\,e^{itF_+(\mathbf{k})}\,dk_x\,dk_y\,dk_z,$$


where 


ff( k ) = 16 ^^ cfc > F+(k) = c^J k 2 + kl + k 2 ,- vk z 

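and $\mathbf{k} = (k_x, k_y, k_z)$. Starting with the $k_x$ integral, we can see that

$$\frac{\partial F_+}{\partial k_x} = \frac{ck_x}{\sqrt{k_x^2 + k_y^2 + k_z^2}},$$

which is zero when $k_x = 0$. This is, therefore, a unique point of stationary phase
and, noting that

$$\left.\frac{\partial^2 F_+}{\partial k_x^2}\right|_{k_x = 0} = \frac{c}{\sqrt{k_y^2 + k_z^2}},$$

the method of stationary phase shows that

$$I_+ \sim e^{i\pi/4}\sqrt{\frac{2\pi}{ct}}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \left(k_y^2 + k_z^2\right)^{1/4} g(k_y, k_z)\,e^{itF_+(k_y, k_z)}\,dk_y\,dk_z,$$

where

$$g(k_y, k_z) = g(0, k_y, k_z), \quad F_+(k_y, k_z) = c\sqrt{k_y^2 + k_z^2} - vk_z.$$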

Similarly, we can use the method of stationary phase on the k y integral, and arrive 
at 

1 f°° 

When $v \neq c$, a simple change of variable, $\hat{k}_z = tk_z$, shows that $I_+ = O(1/t^2)$.
However, when $v = c$, $I_+$ is much larger, of $O(1/t)$. This is consistent with the fact
that disturbances propagate at speed $c$. We therefore assume that $v = c$, so that
we are considering a point moving in the $z$-direction with constant speed $c$, which
gives us

I+ ~ 8 brVf /oo /( °’ °’ kz) dkz ' 

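If we now define

$$I_- = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} g(\mathbf{k})\,e^{itF_-(\mathbf{k})}\,dk_x\,dk_y\,dk_z,$$

with

$$F_-(\mathbf{k}) = -c\sqrt{k_x^2 + k_y^2 + k_z^2} - vk_z,$$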

and follow the analysis through in the same way, we find that 

1 r 00 

Since we have already decided to consider the case $v = c$, $I_- = O(1/t^2)$ for $t \gg 1$,
and therefore

u(vf) ~ /+ ~ J /( 0, 0, fc z ) 

where $\mathbf{v} = c(0, 0, 1)$. Since the $z$-direction can be chosen arbitrarily in this problem,
we conclude that the large time solution is small, of $O(1/t^2)$, except on the surface
of a sphere of radius $ct$, where the solution is given by

“ (c,e) ~ f_J(*e)ds = o (i) , (11.12) 

with $\mathbf{e}$ an arbitrary unit vector. The amplitude of the solution therefore decays like
$1/t$, and also depends upon the total frequency content of the initial conditions,
(5.38), in the direction of $\mathbf{e}$, as given by the integral of their Fourier transform,

$$\int_{-\infty}^{\infty} \tilde{f}(s\mathbf{e})\,ds.$$

In summary,

The Method of Stationary Phase 

Integrals of the form (11.9) can be estimated by 
considering the contribution from the neighbour- 
hood of each point of stationary phase, where 
F'(t) = 0. In the absence of such points, the in- 
tegral is dominated by the contributions from the 
endpoints of the range of integration. 


11.2.3 The Method of Steepest Descents 

Let’s return to consider the more general case,

$$I(\lambda) = \int_C e^{\lambda f(t)} g(t)\,dt \qquad (11.13)$$

for $\lambda \gg 1$, $f$ and $g$ analytic functions of $t = x + iy$ and $C$ a contour in the complex
$t$-plane. Laplace’s method and the method of stationary phase are special cases
of this. Since $f(t)$ and $g(t)$ are analytic, we can deform the contour $C$ without
changing $I(\lambda)$. If we deform $C$ onto a contour $C_1$, on which the imaginary part of
$f(t)$ is a constant, $f(t) = \phi(t) + i\psi_0$, then (11.13) becomes

$$I(\lambda) = e^{i\lambda\psi_0}\int_{C_1} e^{\lambda\phi(t)} g(t)\,dt,$$

and we can simply use Laplace’s method. This is what we really want to do, because
we have shown rigorously, in Lemma 11.1, that this sort of integral is dominated by
the neighbourhood of the point on $C_1$ where the real-valued function $\phi(t)$ is largest.

In order to take this further, we need to know what the curves $\phi$ = constant
and $\psi$ = constant look like for an analytic function $f(t) = \phi(x,y) + i\psi(x,y)$. The
Cauchy-Riemann equations for the analytic function $f(t)$ are (see Appendix 6)

$$\frac{\partial\phi}{\partial x} = \frac{\partial\psi}{\partial y}, \quad \frac{\partial\phi}{\partial y} = -\frac{\partial\psi}{\partial x},$$

which show that

$$\nabla\phi\cdot\nabla\psi = \frac{\partial\phi}{\partial x}\frac{\partial\psi}{\partial x} + \frac{\partial\phi}{\partial y}\frac{\partial\psi}{\partial y} = -\frac{\partial\phi}{\partial x}\frac{\partial\phi}{\partial y} + \frac{\partial\phi}{\partial y}\frac{\partial\phi}{\partial x} = 0$$

and hence that $\nabla\phi$ is perpendicular to $\nabla\psi$. Recall that $\nabla\phi$ is normal to lines of
constant $\phi$, from which we conclude that the lines of constant $\phi$ are perpendicular
to the lines of constant $\psi$. Moreover, $\phi$ changes most rapidly in the direction of $\nabla\phi$,
in other words on the lines of constant $\psi$. We say that the lines where $\psi$ is constant
are contours of steepest descent and ascent for $\phi$. Our strategy for evaluating
the integral (11.13) is therefore to deform the contour of integration into one of
steepest descent. Figure 11.5 illustrates these ideas for the function $f(t) = t^2$.

Fig. 11.5. The real part, $\phi(x,y) = x^2 - y^2$, of $f(t) = t^2$, and the lines of steepest ascent
and descent for the analytic function $f(t) = t^2$. [Second panel: lines of constant $\phi$ and $\psi$.]


Example: $f(t) = t^2$

When $f(t) = t^2$,

$$I(\lambda) = \int_C g(t)\,e^{\lambda t^2}\,dt,$$

with $C$ a contour that joins two points, $P$ and $Q$, in the complex $t$-plane. In this
case, $f(t) = t^2 = (x+iy)^2 = x^2 - y^2 + 2ixy$, and hence $\phi = x^2 - y^2$ and $\psi = 2xy$.
Therefore, $\phi$ is constant on the hyperbolas $x^2 - y^2 = \phi$ and $\psi$ is constant on the
hyperbolas $xy = \frac{1}{2}\psi$, as shown in Figure 11.5.


Case 1: $\psi(P) > 0$, $\psi(Q) > 0$ and $\phi(P) \neq \phi(Q)$

In this case, the ends of the contour $C$ lie in the upper half plane, and without
loss of generality, we take $\phi(P) > \phi(Q)$. We can deform $C$ into the contour $C_1 + C_2$,
with $C_1$ the steepest descent contour on which $\psi = \psi(P)$ and $C_2$ the contour on
which $\phi = \phi(Q)$, as shown in Figure 11.6. On $C_1$ we can therefore make the change
of variable

$$f(t) = f(P) - \tau, \quad \text{with } 0 \leq \tau \leq \phi(P) - \phi(Q).$$

The real part of $f$ varies from $\phi(P)$ to $\phi(Q)$ as $\tau$ varies from zero to $\phi(P) - \phi(Q)$,
whilst $\psi = \psi(P)$ is constant. Since $d\tau/dt = -f'(t)$,

$$\int_{C_1} g(t)\,e^{\lambda f(t)}\,dt = -e^{\lambda f(P)}\int_0^{\phi(P)-\phi(Q)} \frac{g(t)}{f'(t)}\,e^{-\lambda\tau}\,d\tau \sim -e^{\lambda f(P)}\frac{g(P)}{f'(P)}\frac{1}{\lambda} \quad \text{as } \lambda \to \infty,$$

using Watson’s lemma, provided that $g(P)$ is bounded and nonzero. On $C_2$, $\phi = \phi(Q)$
and the imaginary part, $\psi$, varies. We can write

$$\left|\int_{C_2} g(t)\,e^{\lambda f(t)}\,dt\right| \leq GLe^{\lambda\phi(Q)} \ll \left|\int_{C_1} g(t)\,e^{\lambda f(t)}\,dt\right|,$$

since $\phi(Q) < \phi(P)$, where $G = \max_{C_2}|g(t)|$ and $L$ = length of $C_2$. We conclude
that

$$I(\lambda) \sim -e^{\lambda f(P)}\frac{g(P)}{f'(P)}\frac{1}{\lambda} \quad \text{as } \lambda \to \infty.$$

Fig. 11.6. The contours $C_1$ and $C_2$ in case 1.

Case 2: $\psi(P) > 0$, $\psi(Q) > 0$ and $\phi(P) = \phi(Q)$

In this case, we must deform $C$ into $C_1 + C_2 + C_3$, with $\psi$ constant on $C_1$ and $C_3$
and $\phi < \phi(P)$ constant on $C_2$, as shown in Figure 11.7. The exact choice of contour
$C_2$ is not important, since, as in case 1, its contribution is exponentially smaller
than the contributions from $C_1$ and $C_3$. This is very similar to case 1, except that
we now also have, using Laplace’s method, a contribution at leading order from $C_3$.
We find that

$$I(\lambda) \sim \left\{-e^{\lambda f(P)}\frac{g(P)}{f'(P)} + e^{\lambda f(Q)}\frac{g(Q)}{f'(Q)}\right\}\frac{1}{\lambda} \quad \text{as } \lambda \to \infty.$$

Fig. 11.7. The contours $C_1$, $C_2$ and $C_3$ in case 2.

Let’s consider a more interesting example.


Example: The Bessel function of order zero with large argument

We saw in Chapter 3 that the Bessel function $J_n(\lambda)$ has an integral representation,
(3.18). Setting $n = 0$ and making the change of variable $t = \sin\theta$ leads to

$$J_0(\lambda) = \frac{1}{\pi}\int_{-1}^{1} \frac{e^{i\lambda t}}{\sqrt{1-t^2}}\,dt. \qquad (11.14)$$

This is of the form (11.13) with $f(t) = it$, and hence $\phi = -y$, $\psi = x$. In this
case, $P = -1$ and $Q = 1$, so that $\phi(P) = \phi(Q) = 0$. The contours of steepest
descent through $P$ and $Q$ are just the straight lines $\psi = x = -1$ and $\psi = x = 1$
respectively. We therefore deform the contour $C$, which is the portion of the real
axis $-1 \leq x \leq 1$, into $C_1 + C_2 + C_3$, as shown in Figure 11.8. The contribution
from $C_2$, $y = Y > 0$, is exponentially small whatever the choice of $Y$ ($Y = 1$
in Figure 11.8), whilst the contributions from $C_1$ and $C_3$ are dominated by the
neighbourhoods of the endpoints on the real axis. On $C_1$ we make the change of
variable $t = -1 + iy$, so that

$$\frac{1}{\pi}\int_{C_1} \frac{e^{i\lambda t}}{\sqrt{1-t^2}}\,dt = \frac{1}{\pi}\int_0^Y \frac{e^{-i\lambda}e^{-\lambda y}}{\sqrt{2iy + y^2}}\,i\,dy \sim \frac{e^{-i\lambda}\,i}{\pi\sqrt{2i}}\int_0^\infty y^{-1/2}e^{-\lambda y}\,dy \quad \text{as } \lambda \to \infty,$$

using Laplace’s method.

Fig. 11.8. The contours $C_1$, $C_2$ and $C_3$ for evaluating the asymptotic behaviour of $J_0(\lambda)$.

By making the change of variable $\hat{y} = \lambda y$, we can write
this integral in terms of a gamma function, $\Gamma(1/2) = \sqrt{\pi}$, and arrive at

$$\frac{1}{\pi}\int_{C_1} \frac{e^{i\lambda t}}{\sqrt{1-t^2}}\,dt \sim \frac{e^{-i\lambda + i\pi/4}}{\sqrt{2\pi\lambda}} \quad \text{as } \lambda \to \infty.$$

Similarly,

$$\frac{1}{\pi}\int_{C_3} \frac{e^{i\lambda t}}{\sqrt{1-t^2}}\,dt \sim \frac{e^{i\lambda - i\pi/4}}{\sqrt{2\pi\lambda}} \quad \text{as } \lambda \to \infty,$$

and hence

$$J_0(\lambda) \sim \sqrt{\frac{2}{\pi\lambda}}\cos\left(\lambda - \frac{\pi}{4}\right) \quad \text{as } \lambda \to \infty.$$

So, is this the whole story? Let’s return to our original example, $f(t) = t^2$. Since
$f'(t) = 2t$, the real and imaginary parts of $f$ are stationary at $t = 0$. However,
$t = 0$ is neither a local maximum nor a local minimum. It is a saddle point. In
fact, no analytic function can have a local maximum or minimum, since its real and
imaginary parts are harmonic (see Section 7.1.4 and Appendix 6). The real part of
$f(t) = t^2$ has a ridge along the line $y = 0$ and a valley along the line $x = 0$, as can
be seen in Figure 11.5. In cases 1 and 2, which we studied above, we were able to
deform the contour $C$ within one of the valleys.


Case 3: $\psi(P) > 0$ and $\psi(Q) < 0$

In this case, $P$ and $Q$ lie in different valleys of the function $f(t) = t^2$, and
we must deform the contour so that it runs through the line of steepest descent
at the saddle point, since this is the only line with $\psi$ constant that connects the
two valleys, as shown in Figure 11.9. As usual, the integrals on $C_2$ and $C_4$ are
exponentially small, and, using Laplace’s method as in cases 1 and 2,

$$\int_{C_1} g(t)\,e^{\lambda t^2}\,dt \sim -e^{\lambda f(P)}\frac{g(P)}{f'(P)}\frac{1}{\lambda}, \quad \int_{C_5} g(t)\,e^{\lambda t^2}\,dt \sim e^{\lambda f(Q)}\frac{g(Q)}{f'(Q)}\frac{1}{\lambda}.$$

Fig. 11.9. The contours $C_1$, $C_2$, $C_3$, $C_4$ and $C_5$ in case 3.

On the steepest descent contour $C_3$, we can make the simple change of variable
$t = iy$, to arrive at

$$\int_{C_3} g(t)\,e^{\lambda t^2}\,dt = i\int_{C_3} g(iy)\,e^{-\lambda y^2}\,dy.$$

This integral is dominated by the behaviour close to the saddle point, $y = 0$, and,
if $g(0)$ is bounded and nonzero, Laplace’s method shows that

$$\int_{C_3} g(t)\,e^{\lambda t^2}\,dt \sim -ig(0)\int_{-\infty}^{\infty} e^{-\lambda y^2}\,dy = -ig(0)\sqrt{\frac{\pi}{\lambda}}. \qquad (11.15)$$

If $\phi(P) \leq 0$ and $\phi(Q) \leq 0$, this is the dominant contribution.


Finally, what if the contour of integration, $C$, extends to infinity? For such an
integral to converge, it must extend into the valleys of $f(t)$ as $|t| \to \infty$. In our
example, if we let $P$ and $Q$ head off to infinity along the two different valleys, the
integral is dominated by the contribution from the steepest descent contour through
the saddle point, given by (11.15).

Example: The Airy function, $\mathrm{Ai}(x)$, for $|x| \gg 1$

As we have seen, (11.6),

$$\mathrm{Ai}(x) = \frac{1}{2\pi i}\int_C e^{xt - \frac{1}{3}t^3}\,dt, \qquad (11.16)$$

with $\arg(t) = \pm\frac{2\pi}{3}$ as $|t| \to \infty$. For $x > 0$ we can make the change of variable
$t = x^{1/2}\tau$ and, since $\arg(t) = \arg(\tau)$, deform back to the original contour to obtain

$$\mathrm{Ai}(x) = \frac{x^{1/2}}{2\pi i}\int_C e^{x^{3/2}\left(\tau - \frac{1}{3}\tau^3\right)}\,d\tau. \qquad (11.17)$$

This is now in the standard form, with $f(\tau) = \tau - \frac{1}{3}\tau^3$. Since $f'(\tau) = 1 - \tau^2$,
there are saddle points at $\tau = \pm 1$. The contours of constant imaginary part, $\psi$,
are shown in Figure 11.10. We can now deform $C$ into the steepest descent contour
through $\tau = -1$, marked as $C$ in Figure 11.10. As we have seen, the integral is
dominated by the contribution from the neighbourhood of this saddle point. Since
the contour $C$ is vertical at this point, we write $\tau = -1 + iy$ and use Laplace’s
method to show that

$$\mathrm{Ai}(x) \sim \frac{x^{1/2}}{2\pi i}\int_{-\infty}^{\infty} e^{x^{3/2}\left(-\frac{2}{3} - y^2\right)}\,i\,dy = \frac{1}{2\sqrt{\pi}}x^{-1/4}\exp\left(-\frac{2}{3}x^{3/2}\right) \quad \text{as } x \to \infty. \qquad (11.18)$$

Similarly, when $x < 0$, we can make the transformation $t = (-x)^{1/2}T$, so that

$$\mathrm{Ai}(x) = \frac{(-x)^{1/2}}{2\pi i}\int_C e^{(-x)^{3/2}\left(-T - \frac{1}{3}T^3\right)}\,dT. \qquad (11.19)$$

In this case, $f(T) = -T - \frac{1}{3}T^3$ and $f'(T) = -1 - T^2$, so there are saddle points at
$T = \pm i$. The contours of constant imaginary part, $\psi$, are shown in Figure 11.11.
By deforming $C$ into the contour $C_1 + C_2 + C_3$, we can see that the integral is
dominated by the contributions from the two saddle points. The integral along $C_2$
is exponentially smaller. In order to calculate the leading order contribution from
$C_1$, note that this contour passes through $T = -i$ in the direction of $1 + i$. We
therefore make the substitution $T = -i + (1+i)s$ and use Laplace’s method to
arrive at

$$\frac{(-x)^{1/2}}{2\pi i}\int_{C_1} e^{(-x)^{3/2}\left(-T - \frac{1}{3}T^3\right)}\,dT \sim \frac{(-x)^{1/2}}{2\pi i}e^{\frac{2}{3}i(-x)^{3/2}}(1+i)\int_{-\infty}^{\infty} e^{-2(-x)^{3/2}s^2}\,ds = \frac{(-x)^{-1/4}(1+i)}{2\pi i}\sqrt{\frac{\pi}{2}}\,e^{\frac{2}{3}i(-x)^{3/2}}.$$

Similarly,

$$\frac{(-x)^{1/2}}{2\pi i}\int_{C_3} e^{(-x)^{3/2}\left(-T - \frac{1}{3}T^3\right)}\,dT \sim \frac{(-x)^{-1/4}(-1+i)}{2\pi i}\sqrt{\frac{\pi}{2}}\,e^{-\frac{2}{3}i(-x)^{3/2}},$$

and hence

$$\mathrm{Ai}(x) \sim \frac{(-x)^{-1/4}}{\sqrt{\pi}}\sin\left\{\frac{2}{3}(-x)^{3/2} + \frac{\pi}{4}\right\} \quad \text{as } x \to -\infty. \qquad (11.20)$$

The transition from exponential decay as $x \to \infty$ to decaying oscillations as
$x \to -\infty$ is shown in Figure 11.12. Also shown is the behaviour predicted by our analysis
for large $|x|$, which can be seen to be in excellent agreement with $\mathrm{Ai}(x)$, even for
moderate values of $|x|$.

Fig. 11.11. Contours where the imaginary part of $f(T) = -T - \frac{1}{3}T^3$ is constant, along
with a single contour, $C_2$, where the real part of $f$ is constant.
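The agreement between $\mathrm{Ai}(x)$ and its asymptotic forms (11.18) and (11.20) can be reproduced with a few lines of code. This sketch is ours: it evaluates $\mathrm{Ai}(x)$ from its Maclaurin series, using the standard values $\mathrm{Ai}(0) \approx 0.355028$ and $\mathrm{Ai}'(0) \approx -0.258819$, which is adequate only for moderate $|x|$ because of cancellation between the two series. All function names are hypothetical.

```python
import math

AI_0 = 0.355028053887817         # Ai(0)
AI_PRIME_0 = -0.258819403792807  # Ai'(0)


def airy_ai(x, terms=80):
    """Ai(x) = Ai(0) f(x) + Ai'(0) g(x), with f, g the power series solutions
    of y'' = x y satisfying f(0) = 1, f'(0) = 0 and g(0) = 0, g'(0) = 1."""
    f_term, g_term = 1.0, x
    f_sum, g_sum = f_term, g_term
    x_cubed = x ** 3
    for k in range(terms):
        f_term *= x_cubed / ((3 * k + 2) * (3 * k + 3))
        g_term *= x_cubed / ((3 * k + 3) * (3 * k + 4))
        f_sum += f_term
        g_sum += g_term
    return AI_0 * f_sum + AI_PRIME_0 * g_sum


def ai_large_positive(x):
    """Leading-order form (11.18), valid as x -> +infinity."""
    return x ** -0.25 * math.exp(-2.0 / 3.0 * x ** 1.5) / (2.0 * math.sqrt(math.pi))


def ai_large_negative(x):
    """Leading-order form (11.20), valid as x -> -infinity."""
    m = -x
    return m ** -0.25 * math.sin(2.0 / 3.0 * m ** 1.5 + math.pi / 4.0) / math.sqrt(math.pi)


if __name__ == "__main__":
    for x in (1.0, 2.0, 3.0, 4.0):
        print(x, airy_ai(x), ai_large_positive(x))
    for x in (-1.0, -2.0, -5.0):
        print(x, airy_ai(x), ai_large_negative(x))
```

Both asymptotic forms are already accurate to within a few per cent of the local amplitude for $|x|$ between 3 and 5.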


To end this section, we give a justification for the success of the method of
stationary phase. Consider the example that we looked at earlier,

$$I_2(\lambda) = \int_a^b te^{i\lambda(1-t)^2}\,dt,$$

with $a < 1 < b$. Using Cauchy’s theorem on the analytic function $te^{i\lambda(1-t)^2}$ we can
deform the contour $C$, the real axis with $a \leq x \leq b$, into $C_1 + C_2 + C_3 + C_4 + C_5$,
as shown in Figure 11.13. The same arguments as those that we used above show
that the largest contribution comes from the neighbourhood of the saddle point on
the steepest descent contour, $C_3$. This is, however, just the neighbourhood of the
single point of stationary phase, and, even though the contour is different, because
the integrand is analytic, we obtain the same result, (11.10).

Exercises

11.1 Calculate the order of magnitude of the following functions in terms of the
simplest function of $\epsilon$ possible, as $\epsilon \to 0^+$. (a) $\sinh\epsilon$, (b) $\tan\epsilon$,
(c) $\sinh(1/\epsilon)$, (d) $e^{-\epsilon}$, (e) $\cot\epsilon$, (f) $\log(1+\epsilon)$, (g) $(1-\cos\epsilon)/(1+\cos\epsilon)$,
(h) $\exp\{-\cosh(1/\epsilon)\}$.

11.2 Show that $e^{-1/\epsilon} = o(\epsilon^n)$ as $\epsilon \to 0$ for all real $n$, provided that the complex
variable $\epsilon$ is restricted to a domain that you should determine.

11.3 Consider the integral

$$I(x) = e^{-x}\int_1^x \frac{e^t}{t}\,dt \quad \text{as } x \to \infty.$$

By integrating by parts repeatedly, develop a series expansion of the form

$$I(x) = \frac{1}{x} + \frac{1}{x^2} + \frac{2}{x^3} + \frac{3!}{x^4} - (1 + 1 + 2 + 3!)\,e^{1-x} + \cdots.$$

By considering the error in terminating the expansion after the term in
$x^{-n}$, show that the series is asymptotic as $x \to \infty$.

11.4 Show that the following are asymptotic sequences.

Fig. 11.13. The contours $C_1$, $C_2$, $C_3$, $C_4$ and $C_5$ for the method of stationary phase.



— OO

can be approximated as $\nu \to \infty$, with $z = O(1)$ and positive, using

You may assume that the integral is dominated by the contribution from
the neighbourhood of the maximum value of $\nu t - z\cosh t$.

11.8 Use the method of stationary phase to show that


1 + t 


-dt • 


^ aS A 


11.9 Use the method of stationary phase to show that 


Jo 

11.10 Show that as A — > oo 

r l+5i 

e iXt dt - 


tt/2 1 

e i\(2t-sin2t) dt ~ 

2 


1/3 


t/6 


as A 


a 16iA 


(a) 


(b) 


1 


A 32A 2 


r»l+5i 


e iXt dt ■ 


/ —5—i 


n p W 4 
A • 


11.11 Consider the integral (11.9) when $F(t)$ has a single point of stationary
phase at $t = t_0$ with $a < t_0 < b$, $F''(t_0) = 0$ and $F'''(t_0) \neq 0$. Use the
method of stationary phase to show that


^ ' iAE(io)+in’sgn(F"'(to))/6p 


h ~ g g(t 0 ) e 


6 


Vs; \\\F'"(t 0 )\ 


1/3 


for A » 1. 


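11.12 Consider the integral

$$I(\lambda) = \int_P^Q e^{\lambda\left(\frac{1}{2}t^2 + it\right)}\,dt,$$

where $P$ and $Q$ are points in the complex $t$ plane. Sketch the contours of
constant real and imaginary parts of $\frac{1}{2}t^2 + it$. Show that if $P = -\frac{1}{2}$ and
$Q = 2$,

$$I(\lambda) \sim \frac{e^{(2+2i)\lambda}}{(2+i)\lambda} \quad \text{as } \lambda \to \infty.$$

Show that if $P = -\frac{1}{2} + i$ and $Q = 1 - 3i$,

$$I(\lambda) \sim -i\sqrt{\frac{2\pi}{\lambda}}\,e^{\lambda/2} \quad \text{as } \lambda \to \infty.$$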


11.13 Show that Stirling’s formula for $\Gamma(z+1)$ when $z \gg 1$ holds for complex $z$
provided $|z| \gg 1$ and $-\pi < \arg(z) < \pi$. (Hint: Let $z = Re^{i\theta}$.) Determine
the next term in the asymptotic expansion.

11.14 From the integral representation,

$$I_\nu(x) = \frac{1}{\pi}\int_0^\pi e^{x\cos\theta}\cos\nu\theta\,d\theta - \frac{\sin\nu\pi}{\pi}\int_0^\infty e^{-x\cosh t - \nu t}\,dt,$$

of the modified Bessel function of order $\nu$, show that

$$I_\nu(x) \sim \frac{e^x}{\sqrt{2\pi x}},$$

as $x \to \infty$ for real $x$ and fixed $\nu$.

11.15 The parabolic cylinder functions are defined by 

1 


D ~2 m{x) — 


D_ 


(m — 1)! 

1 


xe 


— x 2 /4 


e~ s s m ~ 1 (x 2 + 2s) 


— m— 1/2 


ds, 


2m+l 


i r°° 

(®) = 7 TTT.'ce-* 2 / 4 / e-Sfi™- 1 (x 2 + 2s) 

(?n — 1)! J 0 


— m+1/2 


(m — 1)!* Jo 

for $m$ a positive integer. Show that for real $x$ and fixed $m$,

$$D_{-2m}(x) \sim x^{-2m}e^{-x^2/4}, \quad D_{-2m+1}(x) \sim x^{-2m+1}e^{-x^2/4},$$

as $x \to \infty$.


ds, 



CHAPTER TWELVE 


Asymptotic Methods: Differential Equations 


In this chapter we will apply the ideas that we developed in the previous chapter 
to the solution of ordinary and partial differential equations. 


12.1 An Instructive Analogy: Algebraic Equations 

Many of the essential ideas that we will need in order to solve differential equa- 
tions using asymptotic methods can be illustrated using algebraic equations. These 
are much more straightforward to deal with than differential equations, and the 
ideas that we will use are far more transparent. We will consider two illustrative 
examples. 


12.1.1 Example: A Regular Perturbation 

Consider the cubic equation

$$x^3 - x + \epsilon = 0. \qquad (12.1)$$

Although there is an explicit formula for the solution of cubic equations, it is rather
cumbersome to use. Let’s suppose instead that we only need to find the solutions
for $\epsilon \ll 1$. If we simply set $\epsilon = 0$, we get $x^3 - x = 0$, and hence $x = -1$, 0 or 1.
These are called the leading order solutions of the equation. These solutions
are obviously not exact when $\epsilon$ is small but nonzero, so let’s try to improve the
accuracy of our approximation by seeking an asymptotic expansion of the solution
(or more succinctly, an asymptotic solution) of the form

$$x = x_0 + \epsilon x_1 + \epsilon^2 x_2 + O(\epsilon^3). \qquad (12.2)$$

We can now substitute this into (12.1) and equate powers of $\epsilon$. This gives us

$$\left(x_0 + \epsilon x_1 + \epsilon^2 x_2\right)^3 - \left(x_0 + \epsilon x_1 + \epsilon^2 x_2\right) + \epsilon + O(\epsilon^3) = 0,$$

which we can rearrange into a hierarchy of powers of $\epsilon$ in the form

$$\left\{x_0^3 - x_0\right\} + \epsilon\left\{\left(3x_0^2 - 1\right)x_1 + 1\right\} + \epsilon^2\left\{3x_0x_1^2 + \left(3x_0^2 - 1\right)x_2\right\} + O(\epsilon^3) = 0.$$

At leading order we obviously get $x_0^3 - x_0 = 0$, and hence $x_0 = -1$, 0 or 1. We will
concentrate on the solution with $x_0 = 1$. At $O(\epsilon)$, $(3x_0^2 - 1)x_1 + 1 = 2x_1 + 1 = 0$,
and hence $x_1 = -\frac{1}{2}$. At $O(\epsilon^2)$, $3x_0x_1^2 + (3x_0^2 - 1)x_2 = \frac{3}{4} + 2x_2 = 0$, and hence
$x_2 = -\frac{3}{8}$. We could, of course, continue to higher order if necessary. This shows
that

$$x = 1 - \frac{1}{2}\epsilon - \frac{3}{8}\epsilon^2 + O(\epsilon^3) \quad \text{for } \epsilon \ll 1.$$

Similar expansions can be found for the other two solutions of (12.1). This is a
regular perturbation problem, since we have found asymptotic expansions for all
three roots of the cubic equation using the simple expansion (12.2). Figure 12.1
shows that the function $x^3 - x + \epsilon$ is qualitatively similar for $\epsilon = 0$ and $0 < \epsilon \ll 1$.
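The three-term expansion for the root near $x = 1$ can be checked against a numerical root of (12.1). This is a minimal sketch of ours, using Newton’s method started from the leading order solution; the function names and iteration count are illustrative choices.

```python
def root_near_one(eps, iterations=50):
    """Newton's method for x^3 - x + eps = 0, started at the leading order root x = 1."""
    x = 1.0
    for _ in range(iterations):
        x -= (x ** 3 - x + eps) / (3.0 * x ** 2 - 1.0)
    return x


def three_term_expansion(eps):
    """Asymptotic solution 1 - eps/2 - 3 eps^2 / 8 for the root near x = 1."""
    return 1.0 - eps / 2.0 - 3.0 * eps ** 2 / 8.0


if __name__ == "__main__":
    for eps in (0.1, 0.01, 0.001):
        exact = root_near_one(eps)
        print(eps, exact, three_term_expansion(eps), abs(exact - three_term_expansion(eps)))
```

The error falls by a factor of about 1000 each time $\epsilon$ is reduced by a factor of 10, consistent with the $O(\epsilon^3)$ remainder.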



Fig. 12.1. The function $x^3 - x + \epsilon$ for $\epsilon = 0$, solid line, and $\epsilon = 0.1$, broken line.


12.1.2 Example: A Singular Perturbation 

Consider the cubic equation

\[ \epsilon x^3 + x^2 - 1 = 0. \tag{12.3} \]

At leading order for $\epsilon \ll 1$, $x^2 - 1 = 0$, and hence $x = \pm 1$. However, we know
that a cubic equation is meant to have three solutions. What’s happened to the
other solution? This is an example of a singular perturbation problem, where the
solution for $\epsilon = 0$ is qualitatively different to the solution when $0 < \epsilon \ll 1$. The
equation changes from quadratic to cubic, and the number of solutions goes from
two to three. The key point is that we have implicitly assumed that $x = O(1)$.
However, the term $\epsilon x^3$, which we neglect at leading order when $x = O(1)$, becomes
comparable to the term $x^2$ for sufficiently large $x$, specifically when $x = O(\epsilon^{-1})$.
Figure 12.2 shows how the function $\epsilon x^3 + x^2 - 1$ changes qualitatively for $\epsilon = 0$ and
$0 < \epsilon \ll 1$.

Fig. 12.2. The function $\epsilon x^3 + x^2 - 1$ for $\epsilon = 0$, solid line, and $\epsilon = 0.1$, broken line.


So, how can we proceed in a systematic way? If we expand $x = x_0 + \epsilon x_1 + O(\epsilon^2)$,
we can construct the two $O(1)$ solutions, $x = \pm 1 + O(\epsilon)$, in the same manner as we
did for the previous example. Since we know that there must be three solutions, we
conclude that the other solution cannot have $x = O(1)$, and assume that $x = O(\epsilon^\alpha)$,
with $\alpha$ to be determined. If we define a scaled variable, $x = \epsilon^\alpha X$, with $X = O(1)$
for $\epsilon \ll 1$, (12.3) becomes

\[ \epsilon^{3\alpha+1} X^3 + \epsilon^{2\alpha} X^2 - 1 = 0. \tag{12.4} \]

We must choose $\alpha$ in order to obtain an asymptotic balance between two of
the terms in (12.4). If $\alpha > 0$, the first two terms are small and cannot balance
the third term, which is of $O(1)$. If $\alpha < 0$, the first two terms are large, and we
can choose $\alpha$ so that they are of the same asymptotic order. This requires that
$\epsilon^{3\alpha+1} X^3 = O(\epsilon^{2\alpha} X^2)$, and hence $\epsilon^{3\alpha+1} = O(\epsilon^{2\alpha})$. This gives $3\alpha + 1 = 2\alpha$, and
hence $\alpha = -1$. This means that $x = \epsilon^{-1} X = O(\epsilon^{-1})$, as expected. Equation (12.4)
now becomes

\[ X^3 + X^2 - \epsilon^2 = 0. \tag{12.5} \]

Since only $\epsilon^2$ and not $\epsilon$ appears in this rescaled equation, we expand $X = X_0 +
\epsilon^2 X_1 + O(\epsilon^4)$. At leading order, $X_0^3 + X_0^2 = 0$, and hence $X_0 = -1$ or $0$. Of course,
$X_0 = 0$ will just give us the two solutions with $x = O(1)$ that we have already
considered, so we take $X_0 = -1$. At $O(\epsilon^2)$,

\[ (-1 + \epsilon^2 X_1)^3 + (-1 + \epsilon^2 X_1)^2 - \epsilon^2 + O(\epsilon^4) = 0 \]

gives

\[ -1 + 3\epsilon^2 X_1 + 1 - 2\epsilon^2 X_1 - \epsilon^2 + O(\epsilon^4) = 0, \]

and hence $X_1 = 1$. Therefore $X = -1 + \epsilon^2 + O(\epsilon^4)$, and hence $x = -1/\epsilon + \epsilon + O(\epsilon^3)$.
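The large root can be checked numerically in the same spirit. The sketch below (with illustrative function names that are not part of the text) applies Newton's method to $\epsilon x^3 + x^2 - 1 = 0$, starting from the leading order estimate $x = -1/\epsilon$, and compares the result with the two-term expansion $x = -1/\epsilon + \epsilon$.

```python
def large_root_newton(eps, iterations=60):
    """Newton iteration for eps*x**3 + x**2 - 1 = 0, started from x = -1/eps."""
    x = -1.0 / eps
    for _ in range(iterations):
        x -= (eps * x**3 + x**2 - 1.0) / (3.0 * eps * x**2 + 2.0 * x)
    return x


def large_root_asymptotic(eps):
    """Two-term expansion x = -1/eps + eps of the root with x = O(1/eps)."""
    return -1.0 / eps + eps


for eps in (0.1, 0.05, 0.025):
    exact = large_root_newton(eps)
    approx = large_root_asymptotic(eps)
    # If the remainder is O(eps**3), this ratio should level off.
    print(f"eps={eps:<6} exact={exact:.8f} approx={approx:.8f} "
          f"error/eps^3={(exact - approx) / eps**3:.4f}")
```

Note that the starting guess has to come from the rescaled problem: a guess with $x = O(1)$ would simply converge to one of the roots near $\pm 1$.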

12.2 Ordinary Differential Equations 

The solution of ordinary differential equations by asymptotic methods often pro- 
ceeds in a similar way to the solution of algebraic equations, which we discussed 
in the previous section. We assume that an asymptotic expansion of the solution 
exists, substitute into the equation and boundary conditions, and equate powers of 
the small parameter. This determines a sequence of simpler equations and bound- 
ary conditions that we can solve. In order to introduce the main ideas, we will 
begin by considering some simple, constant coefficient linear ordinary differential 
equations before moving on to study both nonlinear ordinary differential equations 
and some simple partial differential equations. 


12.2.1 Regular Perturbations 

Consider the ordinary differential equation

\[ y'' + 2\epsilon y' - y = 0, \tag{12.6} \]

to be solved for $0 \le x \le 1$, subject to the boundary conditions

\[ y(0) = 0, \quad y(1) = 1. \tag{12.7} \]

Of course, we could solve this constant coefficient ordinary differential equation
analytically using the method described in Appendix 5, but it is instructive to try
to construct the asymptotic solution when $\epsilon \ll 1$. We seek a solution of the form

\[ y(x) = y_0(x) + \epsilon y_1(x) + O(\epsilon^2). \]


At leading order, $y_0'' - y_0 = 0$, subject to $y_0(0) = 0$ and $y_0(1) = 1$, which has
solution

\[ y_0(x) = \frac{\sinh x}{\sinh 1}. \]


If we now substitute the expansion for $y$ into (12.6) and retain terms up to $O(\epsilon)$,
we obtain

\[ (y_0 + \epsilon y_1)'' + 2\epsilon(y_0 + \epsilon y_1)' - (y_0 + \epsilon y_1) + O(\epsilon^2)
   = y_0'' + \epsilon y_1'' + 2\epsilon y_0' - y_0 - \epsilon y_1 + O(\epsilon^2) = 0, \]

and hence

\[ y_1'' - y_1 = -2y_0' = -2\,\frac{\cosh x}{\sinh 1}. \tag{12.8} \]

Similarly, the boundary conditions (12.7) show that

\[ y_0(0) + \epsilon y_1(0) + O(\epsilon^2) = 0, \quad y_0(1) + \epsilon y_1(1) + O(\epsilon^2) = 1, \]

and hence

\[ y_1(0) = 0, \quad y_1(1) = 0. \tag{12.9} \]


By seeking a particular integral solution of (12.8) in the form $y_{1p} = kx \sinh x$, and
using the constants in the homogeneous solution, $y_{1h} = A \sinh x + B \cosh x$, to
satisfy the boundary conditions (12.9), we find that $k = -1/\sinh 1$, $B = 0$ and
$A = 1/\sinh 1$, and so arrive at

\[ y_1 = (1 - x)\,\frac{\sinh x}{\sinh 1}, \]

and hence

\[ y = \frac{\sinh x}{\sinh 1} + \epsilon (1 - x)\,\frac{\sinh x}{\sinh 1} + O(\epsilon^2) \tag{12.10} \]

for $\epsilon \ll 1$. The ratio of the second to the first term in this expansion is
$\epsilon(1 - x)$, which is uniformly small for $0 \le x \le 1$. This leads us to believe that
the asymptotic solution (12.10) is uniformly valid in the region of solution. The
situation is analogous to the example that we studied in Section 12.1.1.

One subtle point is that, although we believe that the next term in the asymptotic
expansion of the solution, which we write as $O(\epsilon^2)$ in (12.10), is uniformly smaller
than the two retained terms for $\epsilon$ sufficiently small, we have not proved this. We
do not have a rigorous estimate for the size of the neglected term in the way that
we did when we considered the exponential integral, where we were able to find an
upper bound for $R_N$, given by (11.3). Although, for this simple, constant coefficient
ordinary differential equation, we could write down the exact solution and prove
that the remainder is of $O(\epsilon^2)$, in general, and in particular for nonlinear problems,
this is not possible, and an act of faith is involved in trusting that our asymptotic 
solution provides a valid representation of the exact solution. This faith can be 
bolstered in a number of ways, for example, by comparing asymptotic solutions with 
numerical solutions, and by checking that the asymptotic solution makes sense in 
terms of the underlying physics of the system that we are modelling. The sensible 
applied mathematician always has a small, nagging doubt at the back of their mind 
about the validity of an asymptotic solution. For (12.6), our faith is justified, as 
can be seen in Figure 12.3. 
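For this particular problem the comparison with the exact solution is easy to carry out. The sketch below assumes the exact solution $y = e^{\epsilon(1-x)} \sinh(sx)/\sinh s$ with $s = \sqrt{1+\epsilon^2}$, which follows from the characteristic roots $-\epsilon \pm s$ of (12.6), and the two-term expansion with $y_1 = (1-x)\sinh x/\sinh 1$, which is what (12.8) and (12.9) give. The function names are illustrative only.

```python
import math


def y_exact(x, eps):
    """Exact solution of y'' + 2*eps*y' - y = 0 with y(0) = 0, y(1) = 1."""
    s = math.sqrt(1.0 + eps * eps)
    return math.exp(eps * (1.0 - x)) * math.sinh(s * x) / math.sinh(s)


def y_two_term(x, eps):
    """Two-term asymptotic solution y0 + eps*y1."""
    return (1.0 + eps * (1.0 - x)) * math.sinh(x) / math.sinh(1.0)


def max_error(eps, n=1000):
    """Largest pointwise difference on a uniform grid over [0, 1]."""
    return max(abs(y_exact(i / n, eps) - y_two_term(i / n, eps))
               for i in range(n + 1))


for eps in (0.25, 0.125, 0.0625):
    err = max_error(eps)
    # If the neglected term is O(eps**2), err/eps**2 should stay bounded.
    print(f"eps={eps:<7} max error={err:.3e} error/eps^2={err / eps**2:.4f}")
```

A bounded value of the last column as $\epsilon$ is halved is numerical evidence, not a proof, that the remainder is uniformly $O(\epsilon^2)$.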


12.2.2 The Method of Matched Asymptotic Expansions 

Consider the ordinary differential equation

\[ \epsilon y'' + 2y' - y = 0, \tag{12.11} \]

to be solved for $0 \le x \le 1$, subject to the boundary conditions

\[ y(0) = 0, \quad y(1) = 1. \tag{12.12} \]

The observant reader will notice that this is the same as the previous example, but
with the small parameter $\epsilon$ moved to the highest derivative term. We again seek
an asymptotic solution of the form

\[ y(x) = y_0(x) + \epsilon y_1(x) + O(\epsilon^2). \]

Fig. 12.3. Exact and asymptotic solutions of (12.6) when $\epsilon = 0.25$.


At leading order, $2y_0' - y_0 = 0$, which has the solution $y_0 = Ae^{x/2}$ for some constant
$A$. However, the boundary conditions require that $y_0(0) = 0$ and $y_0(1) = 1$. How
can we satisfy two boundary conditions using only one constant? Well, of course
we can’t. The problem is that for $\epsilon = 0$, the equation is of first order, and therefore
qualitatively different from the full, second order equation. This is a singular
perturbation, and is analogous to the example we studied in Section 12.1.2.

Let’s proceed by satisfying the boundary condition $y_0(1) = 1$, which gives

\[ y_0(x) = e^{(x-1)/2}. \]

At $O(\epsilon)$ we have

\[ 2y_1' - y_1 = -y_0'' = -\frac{1}{4} e^{(x-1)/2}, \]

to be solved subject to $y_1(0) = 0$ and $y_1(1) = 0$. This equation can be solved using
an integrating factor (see Section A5.2), which gives

\[ y_1(x) = -\frac{1}{8} x e^{(x-1)/2} + k e^{(x-1)/2}, \]

for some constant $k$. Again, we cannot satisfy both boundary conditions, and we
just use $y_1(1) = 0$, which gives

\[ y_1(x) = \frac{1}{8}(1 - x) e^{(x-1)/2}. \]

Finally, this gives

\[ y = e^{(x-1)/2}\left\{1 + \frac{1}{8}(1 - x)\epsilon\right\} + O(\epsilon^2), \tag{12.13} \]

for $\epsilon \ll 1$. This suggests that $y \to e^{-1/2}\left(1 + \frac{1}{8}\epsilon\right)$ as $x \to 0$, which clearly does not
satisfy the boundary condition $y(0) = 0$. We must therefore introduce a boundary
layer at $x = 0$, across which $y$ adjusts to satisfy the boundary condition. The idea
is that, in some small neighbourhood of $x = 0$, the term $\epsilon y''$, which we neglected
at leading order, becomes important because $y$ varies rapidly.

We rescale by defining $x = \epsilon^\alpha X$, with $\alpha > 0$ (so that $x \ll 1$) and $X = O(1)$ as
$\epsilon \to 0$, and write $y(x) = Y(X)$ for $X = O(1)$. Substituting this into (12.11) gives

\[ \epsilon^{1-2\alpha} \frac{d^2Y}{dX^2} + 2\epsilon^{-\alpha} \frac{dY}{dX} - Y = 0. \]

Since $\alpha > 0$, the second term in this equation is large, and to obtain an asymptotic
balance at leading order we must have $\epsilon^{1-2\alpha} = O(\epsilon^{-\alpha})$, which means that $1 - 2\alpha =
-\alpha$, and hence $\alpha = 1$. So $x = \epsilon X$,

\[ \frac{d^2Y}{dX^2} + 2\frac{dY}{dX} - \epsilon Y = 0, \tag{12.14} \]

and $Y(0) = 0$. It is usual to refer to the region where $\epsilon \ll x \le 1$ as the outer
region, with outer solution $y(x)$, and the boundary layer region where $x = O(\epsilon)$
as the inner region with inner solution $Y(X)$. The other boundary condition
is to be applied at $x = 1$. However, $x = 1$ does not lie in the inner region, where
$x = O(\epsilon)$. In order to fix a second boundary condition for (12.14), we will have to
make sure that the solution in the inner region is consistent, in a sense that we will
make clear below, with the solution in the outer region, which does satisfy $y(1) = 1$.

We now expand

\[ Y(X) = Y_0(X) + \epsilon Y_1(X) + O(\epsilon^2). \]

At leading order, $Y_0'' + 2Y_0' = 0$, to be solved subject to $Y_0(0) = 0$. The solution is

\[ Y_0 = A(1 - e^{-2X}), \]

for some constant $A$. At leading order, we now know that

\[ y \sim e^{(x-1)/2} \quad \text{for } \epsilon \ll x \le 1 \text{ (the outer expansion)}, \]

\[ Y \sim A(1 - e^{-2X}) \quad \text{for } X = O(1),\ x = O(\epsilon) \text{ (the inner expansion)}. \]

For these two expansions to be consistent with each other, we must have

\[ \lim_{X \to \infty} Y(X) = \lim_{x \to 0} y(x), \tag{12.15} \]

which gives $A = e^{-1/2}$. We will see below, where we make this vague notion of
“consistency” more precise, that this is correct.

At $O(\epsilon)$ we obtain the equation for $Y_1(X)$ as

\[ Y_1'' + 2Y_1' = Y_0 = A(1 - e^{-2X}). \]

Integrating this once gives

\[ Y_1' + 2Y_1 = A\left(X + \frac{1}{2} e^{-2X}\right) + c_1, \]

for some constant $c_1$. This can now be solved using an integrating factor, and the
solution that satisfies $Y_1(0) = 0$ is

\[ Y_1 = \frac{1}{2} A X (1 + e^{-2X}) - c_2 (1 - e^{-2X}), \]

for some constant $c_2$, which we need to determine. To summarize, the two-term
asymptotic expansions are

\[ y \sim e^{(x-1)/2} + \frac{1}{8}\epsilon(1 - x) e^{(x-1)/2} \quad \text{for } \epsilon \ll x \le 1, \]

\[ Y \sim A(1 - e^{-2X}) + \epsilon\left\{\frac{1}{2} A X (1 + e^{-2X}) - c_2 (1 - e^{-2X})\right\} \quad \text{for } X = O(1),\ x = O(\epsilon). \]

We can determine the constants $A$ and $c_2$ by forcing the two expansions to be
consistent in the sense that they should be equal in an intermediate region or
overlap region, where $\epsilon \ll x \ll 1$. The point is that in such a region we expect
both expansions to be valid.

We define $x = \epsilon^\beta \hat{X}$ with $0 < \beta < 1$, and write $y = \hat{Y}(\hat{X})$. In terms of the
intermediate variable, $\hat{X}$, the outer expansion becomes

\[ \hat{Y} \sim e^{-1/2} \exp\left(\frac{1}{2}\epsilon^\beta \hat{X}\right) + \frac{1}{8}\epsilon\left(1 - \epsilon^\beta \hat{X}\right) e^{-1/2} \exp\left(\frac{1}{2}\epsilon^\beta \hat{X}\right). \]

When $\hat{X} = O(1)$, we can expand the exponentials as Taylor series, and find that

\[ \hat{Y} = e^{-1/2} + \frac{1}{2} e^{-1/2} \epsilon^\beta \hat{X} + \frac{1}{8} e^{-1/2} \epsilon + o(\epsilon). \tag{12.16} \]

Since $x = \epsilon X = \epsilon^\beta \hat{X}$ gives $X = \epsilon^{\beta-1} \hat{X}$, the inner expansion is

\[ \hat{Y} \sim A\left\{1 - \exp\left(-2\epsilon^{\beta-1}\hat{X}\right)\right\}
   + \epsilon\left[\frac{1}{2} A \epsilon^{\beta-1} \hat{X}\left\{1 + \exp\left(-2\epsilon^{\beta-1}\hat{X}\right)\right\} - c_2\left\{1 - \exp\left(-2\epsilon^{\beta-1}\hat{X}\right)\right\}\right]. \]

Now, since $\exp(-2\epsilon^{\beta-1}\hat{X}) = o(\epsilon^n)$ for all $n > 0$ (it is exponentially small for $\beta < 1$),
we have

\[ \hat{Y} = A + \frac{1}{2} A \epsilon^\beta \hat{X} - c_2 \epsilon + o(\epsilon). \tag{12.17} \]

Since (12.16) and (12.17) must be identical, we need $A = e^{-1/2}$, consistent with
the crude assumption, (12.15), that we made above, and also $c_2 = -\frac{1}{8} e^{-1/2}$. This
process, whereby we make the inner and outer expansions consistent, is known
as asymptotic matching, and the inner and outer expansions are known as
matched asymptotic expansions. A comparison between the one-term inner
and outer solutions and the exact solution is given in Figure 12.4. It should be
clear that the inner expansion is a poor approximation in the outer region and
vice versa. A little later, we will show how to construct, using the inner and outer
expansions, a composite expansion that is uniformly valid for $0 \le x \le 1$.



Fig. 12.4. Exact and asymptotic solutions of (12.11) when $\epsilon = 0.25$.


Van Dyke’s Matching Principle 

The use of an intermediate variable in an overlap region can get very tedious in 
more complicated problems. A method that works most, but not all, of the time, 
and is much easier to use, is Van Dyke’s matching principle. This principle is 
much easier to use than to explain, but let’s start with the explanation. 

Let

\[ y(x) \sim \sum_{n=0}^{N} \phi_n(\epsilon) y_n(x) \]

be the outer expansion and

\[ Y(X) \sim \sum_{n=0}^{N} \psi_n(\epsilon) Y_n(X) \]

be the inner expansion with respect to the asymptotic sequences $\phi_n(\epsilon)$ and $\psi_n(\epsilon)$,
with $x = f(\epsilon)X$. In order to analyze how the outer expansion behaves in the inner
region, we can write $y(x)$ in terms of $X = x/f(\epsilon)$, and retain $M$ terms in the
resulting asymptotic expansion. We denote this by $y^{(N,M)}$, the $M$th order inner
approximation of the outer expansion. Similarly, we can write $Y(X)$ in terms of $x$,
and retain $M$ terms in the resulting expansion. We denote this by $Y^{(N,M)}$, the $M$th
order outer approximation of the inner expansion. Van Dyke’s matching principle
states that $y^{(N,M)} = Y^{(M,N)}$. Let’s see how this works for our example.

In terms of the outer variable, the inner expansion is

\[ Y(X) \sim A\left(1 - e^{-2x/\epsilon}\right) + \epsilon\left\{\frac{Ax}{2\epsilon}\left(1 + e^{-2x/\epsilon}\right) - c_2\left(1 - e^{-2x/\epsilon}\right)\right\}
   \sim Y^{(2,2)} = A + \frac{1}{2} A x - c_2 \epsilon, \]

for $x = O(1)$. In terms of the inner variable, the outer expansion is

\[ y(x) \sim \exp\left(-\frac{1}{2} + \frac{1}{2}\epsilon X\right) + \frac{1}{8}\epsilon(1 - \epsilon X)\exp\left(-\frac{1}{2} + \frac{1}{2}\epsilon X\right) \]

for $X = O(1)$. In terms of the outer variable,

\[ y^{(2,2)} = e^{-1/2}\left(1 + \frac{1}{2} x + \frac{1}{8}\epsilon\right). \]

Van Dyke’s matching principle states that $Y^{(2,2)} = y^{(2,2)}$, and therefore gives $A =
e^{-1/2}$ and $c_2 = -\frac{1}{8} e^{-1/2}$ rather more painlessly than before.

Composite Expansions 

Although we now know how to construct inner and outer solutions, valid in the
inner and outer regions, it would be more useful to have an asymptotic solution
valid uniformly across the whole domain of solution, $0 \le x \le 1$ in the example. We
can construct such a uniformly valid composite expansion by using the inner
and outer expansions. We simply need to add the two expansions and subtract
the expression that gives their overlap. The overlap is just the expression that
appears to the appropriate order in the intermediate region, (12.17) or (12.16),
or equivalently the matched part, $y^{(2,2)}$ or $Y^{(2,2)}$. For our example problem, the
one-term composite expansion is

\[ y \sim y_0 + Y_0 - y^{(1,1)} = e^{(x-1)/2} + e^{-1/2}\left(1 - e^{-2X}\right) - e^{-1/2}
   = e^{(x-1)/2} - e^{-1/2 - 2x/\epsilon} \quad \text{for } 0 \le x \le 1 \text{ as } \epsilon \to 0. \]

This composite expansion is shown in Figure 12.4, and shows good agreement with
the exact solution across the whole domain, as expected. Note that, in terms of
Van Dyke’s matching principle, we can write the composite solution of any order
as

\[ y \sim y_c^{(M,N)} = \sum_{n=0}^{M} y_n(x) + \sum_{n=0}^{N} Y_n(X) - y^{(M,N)}. \]
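The behaviour of the one-term outer, inner and composite approximations can be explored numerically. The sketch below assumes the exact solution of (12.11), built from the roots $m_\pm = (-1 \pm \sqrt{1+\epsilon})/\epsilon$ of $\epsilon m^2 + 2m - 1 = 0$, and measures the largest error of each approximation on a uniform grid. All function names are illustrative rather than taken from the text.

```python
import math


def y_exact(x, eps):
    """Exact solution of eps*y'' + 2*y' - y = 0 with y(0) = 0, y(1) = 1."""
    root = math.sqrt(1.0 + eps)
    m1 = (-1.0 + root) / eps
    m2 = (-1.0 - root) / eps
    return (math.exp(m1 * x) - math.exp(m2 * x)) / (math.exp(m1) - math.exp(m2))


def y_outer(x, eps):
    """One-term outer solution (independent of eps at this order)."""
    return math.exp((x - 1.0) / 2.0)


def y_inner(x, eps):
    """One-term inner solution, written in terms of x = eps*X."""
    return math.exp(-0.5) * (1.0 - math.exp(-2.0 * x / eps))


def y_composite(x, eps):
    """One-term composite: outer + inner - matched part."""
    return math.exp((x - 1.0) / 2.0) - math.exp(-0.5 - 2.0 * x / eps)


def max_error(approx, eps, n=2000):
    """Largest pointwise error of approx(x, eps) on a uniform grid over [0, 1]."""
    return max(abs(approx(i / n, eps) - y_exact(i / n, eps))
               for i in range(n + 1))


for eps in (0.25, 0.05, 0.0125):
    print(f"eps={eps:<7}"
          f" outer={max_error(y_outer, eps):.3e}"
          f" inner={max_error(y_inner, eps):.3e}"
          f" composite={max_error(y_composite, eps):.3e}")
```

One would expect the outer and inner errors to remain $O(1)$ somewhere in the domain however small $\epsilon$ is, while the composite error shrinks with $\epsilon$, which is the sense in which only the composite expansion is uniformly valid.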





The Location of the Boundary Layer 

In our example, when we constructed the outer solution, we chose to satisfy the
boundary condition at $x = 1$ and assume that there was a boundary layer at $x = 0$.
Why is this correct? Let’s see what happens if we assume that there is a boundary
layer at $x = x_0$. Strictly speaking, if $x_0 \ne 0$ and $x_0 \ne 1$ this is an interior layer.
We define scaled variables $y(x) = Y(X)$ and $x = x_0 + \epsilon^\alpha X$, with $\alpha > 0$ and $Y$,
$X = O(1)$ for $\epsilon \ll 1$. As before, we find that we can only obtain an asymptotic
balance at leading order by taking $\alpha = 1$, so that $x = x_0 + \epsilon X$ and

\[ Y'' + 2Y' - \epsilon Y = 0. \]

At leading order, as before, $Y_0 = A_0 + B_0 e^{-2X}$. As $X \to -\infty$, $Y_0$ becomes
exponentially large, and cannot be matched to the outer solution. This forces us to take
$x_0 = 0$, since then this solution is only needed for $X \ge 0$, and, as we have seen, we
can construct an asymptotic solution.

Interior Layers 

Singular perturbations of ordinary differential equations need not always result in
a boundary layer. As an example, consider

\[ \epsilon y'' + 2xy' + 2x = 0, \tag{12.18} \]

to be solved for $-1 \le x \le 1$, subject to the boundary conditions

\[ y(-1) = 2, \quad y(1) = 3. \tag{12.19} \]

We will try to construct the leading order solution for $\epsilon \ll 1$. The leading order
outer solution satisfies $2x(y' + 1) = 0$, and hence $y = k - x$ for some constant $k$. If
$y(-1) = 2$ we need $y = 1 - x$, whilst if $y(1) = 3$ we need $y = 4 - x$. We clearly cannot
satisfy both boundary conditions with the same outer solution, so let’s look for a
boundary or interior layer at $x = x_0$ by defining $y(x) = Y(X)$ and $x = x_0 + \epsilon^\alpha X$,
with $Y$, $X = O(1)$. In terms of these scaled variables, (12.18) becomes

\[ \epsilon^{1-2\alpha} Y_{XX} + 2(x_0 + \epsilon^\alpha X)(\epsilon^{-\alpha} Y_X + 1) = 0. \]

If $x_0 \ne 0$, for a leading order balance we need $\epsilon^{1-2\alpha} = O(\epsilon^{-\alpha})$, and hence $\alpha = 1$.
In this case, at leading order,

\[ Y_{XX} + 2x_0 Y_X = 0, \]

and hence $Y = A + Be^{-2x_0 X}$. For $x_0 > 0$ this grows exponentially as $X \to -\infty$,
whilst for $x_0 < 0$ this grows exponentially as $X \to \infty$. In either case, we cannot
match these exponentially growing terms with the outer solution. This suggests
that we need $x_0 = 0$, when

\[ \epsilon^{1-2\alpha} Y_{XX} + 2XY_X + 2\epsilon^\alpha X = 0. \]

For a leading order asymptotic balance we need $\alpha = 1/2$, and hence a boundary
layer with width of $O(\epsilon^{1/2})$. At leading order,


\[ Y_{XX} + 2XY_X = 0, \]

which, after multiplying by the integrating factor, $e^{X^2}$, gives

\[ \frac{d}{dX}\left(e^{X^2} Y_X\right) = 0, \]

and hence

\[ Y = B + A\int_{-\infty}^{X} e^{-s^2}\,ds. \]


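Now, since the interior layer is at $x = 0$, the outer solution must be

\[ y = \begin{cases} 1 - x & \text{for } -1 \le x < O(\epsilon^{1/2}), \\ 4 - x & \text{for } O(\epsilon^{1/2}) < x \le 1. \end{cases} \]

Since $y \to 4$ as $x \to 0^+$ and $y \to 1$ as $x \to 0^-$, we must have $Y \to 4$ as $X \to \infty$ and
$Y \to 1$ as $X \to -\infty$. This allows us to fix the constants $A$ and $B$ and find that

\[ Y(X) = 1 + \frac{3}{\sqrt{\pi}}\int_{-\infty}^{X} e^{-s^2}\,ds = \frac{1}{2}\left(5 + 3\,\mathrm{erf}(X)\right), \]

which leads to the one-term composite solution

\[ y_c = -x + \frac{1}{2}\left\{5 + 3\,\mathrm{erf}\left(\frac{x}{\sqrt{\epsilon}}\right)\right\} \quad \text{for } -1 \le x \le 1 \text{ and } \epsilon \ll 1. \tag{12.20} \]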


This is illustrated in Figure 12.5 for various values of $\epsilon$. Note that the boundary
conditions at $x = \pm 1$ are only satisfied by the composite expansion at leading order.
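The composite solution (12.20) is easy to probe numerically. The sketch below evaluates it using the error function from Python's standard library, checks how closely it meets the boundary conditions (12.19), and estimates the residual of (12.18) by central finite differences. The function names and the step size `h` are illustrative assumptions, not part of the text.

```python
import math


def y_composite(x, eps):
    """One-term composite solution (12.20)."""
    return -x + 0.5 * (5.0 + 3.0 * math.erf(x / math.sqrt(eps)))


def ode_residual(x, eps, h=1e-4):
    """Finite-difference estimate of eps*y'' + 2*x*y' + 2*x for the composite."""
    y_minus = y_composite(x - h, eps)
    y_mid = y_composite(x, eps)
    y_plus = y_composite(x + h, eps)
    d1 = (y_plus - y_minus) / (2.0 * h)
    d2 = (y_plus - 2.0 * y_mid + y_minus) / h**2
    return eps * d2 + 2.0 * x * d1 + 2.0 * x


eps = 0.01
print("boundary values:", y_composite(-1.0, eps), y_composite(1.0, eps))
for x in (-0.5, -0.1, 0.0, 0.1, 0.5):
    print(f"x={x:+.1f}  y_c={y_composite(x, eps):.6f}  "
          f"residual={ode_residual(x, eps):.2e}")
```

For this particular equation the residual is small everywhere, because substituting $y = -x + Y(x/\sqrt{\epsilon})$ into (12.18) leaves exactly the leading order inner equation; the only error in the composite solution comes from the boundary values, which differ from 2 and 3 by exponentially small amounts.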


12.2.3 Nonlinear Problems 

Example 1 : Elliptic functions of large period 

As we have already seen in Section 9.4, the Jacobian elliptic function $x = \mathrm{sn}(t; k)$
satisfies the equation

\[ \frac{dx}{dt} = \sqrt{1 - x^2}\,\sqrt{1 - k^2 x^2}, \tag{12.21} \]

subject to $x = 0$ when $t = 0$, and has periodic solutions for $k \ne 1$. When $k = 1$,
the solution that corresponds to $\mathrm{sn}(t; k)$ is a heteroclinic path that connects the
equilibrium points $(\pm 1, 0, 0)$ in the phase portrait shown in Figure 9.18, and hence
the period tends to infinity as $k \to 1$. When $k$ is close to unity, it seems reasonable
to assume that the period of the solution is large but finite. Can we quantify this?
Let’s assume that $k^2 = 1 - \delta^2$, with $\delta \ll 1$, and seek an asymptotic solution for
the first quarter period of $x(t)$, with $0 \le x \le 1$. Figure 12.6 shows $\mathrm{sn}(t; k)$ for
various values of $\delta$, and we can see that the period does increase as $\delta$ decreases and
$k$ approaches unity. The function $\mathrm{sn}(t; k)$ is available in MATLAB as ellipj. The
quarter period is simply the value of $t$ when $\mathrm{sn}(t; k)$ reaches its first maximum.

We seek an asymptotic solution of the form

\[ x = x_0 + \delta^2 x_1 + \delta^4 x_2 + O(\delta^6). \]

Fig. 12.5. The composite asymptotic solution, (12.20), of (12.18).


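Using a binomial expansion, (12.21) is

\[ \frac{dx}{dt} = (1 - x^2)\left(1 + \delta^2\frac{x^2}{1 - x^2}\right)^{1/2}
   = 1 - x^2 + \frac{1}{2}\delta^2 x^2 - \frac{1}{8}\delta^4\frac{x^4}{1 - x^2} + O(\delta^6). \]

This binomial expansion is only valid when $x$ is not too close to unity, so we should
expect any asymptotic expansion that we develop to become nonuniform as $x \to 1$,
and we treat this as the outer expansion.

At leading order,

\[ \frac{dx_0}{dt} = 1 - x_0^2, \quad \text{subject to } x_0(0) = 0, \]

which has solution $x_0 = \tanh t$. At $O(\delta^2)$,

\[ \frac{dx_1}{dt} = -2x_0 x_1 + \frac{1}{2} x_0^2 = -2\tanh t\, x_1 + \frac{1}{2}\tanh^2 t, \quad \text{subject to } x_1(0) = 0. \]

Using the integrating factor $\cosh^2 t$, we can find the solution

\[ x_1 = \frac{1}{4}\left(\tanh t - t\,\mathrm{sech}^2 t\right). \]

We can now see that

\[ x \sim 1 - 2e^{-2t} + \frac{1}{4}\delta^2 + O(\delta^4) \quad \text{as } t \to \infty. \]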
Fig. 12.6. The Jacobian elliptic function $\mathrm{sn}(t; k)$ for various values of $\delta$ (panels: $\delta = 0.1$, $\delta = 0.01$, $\delta = 0.001$, $\delta = 0.0001$).


Although $x$ approaches unity as $t \to \infty$, there is no nonuniformity in this expansion,
so we need to go to $O(\delta^4)$. At this order,

\[ \frac{dx_2}{dt} = -x_1^2 - 2x_0 x_2 + x_0 x_1 - \frac{1}{8}\left(\frac{x_0^4}{1 - x_0^2}\right), \quad \text{subject to } x_2(0) = 0. \]

Solving this problem would be extremely tedious. Fortunately, we don’t really want
to know the exact expression for $x_2$, just its behaviour as $t \to \infty$. Using the known
behaviour of the various hyperbolic functions, we find that

\[ \frac{dx_2}{dt} + 2x_2 \sim -\frac{1}{32} e^{2t} \quad \text{as } t \to \infty, \]

and hence from solving this simple linear equation,

\[ x_2 \sim -\frac{1}{128} e^{2t} \quad \text{as } t \to \infty. \]

This shows that

\[ x \sim 1 - 2e^{-2t} + \frac{1}{4}\delta^2 - \frac{1}{128}\delta^4 e^{2t} + O(\delta^6) \quad \text{as } t \to \infty. \tag{12.22} \]

We can now see that the fourth term in this expansion becomes comparable to the
third when $\delta^4 e^{2t} = O(\delta^2)$, and hence as $t \to \log(1/\delta)$, when $x = 1 + O(\delta^2)$.
We therefore define new variables for an inner region close to $x = 1$ as

\[ x = 1 - \delta^2 X, \quad t = \log\left(\frac{1}{\delta}\right) + T. \]

On substituting these inner variables into (12.21), we find that, at leading order,

\[ \frac{dX}{dT} = -\sqrt{2X(2X + 1)}. \]

Using the substitution $X = \bar{X} - \frac{1}{4}$ brings this separable equation into a standard
form, and the solution is

\[ X = \frac{1}{4}\left\{\cosh(K - 2T) - 1\right\}. \tag{12.23} \]

We now need to determine the constant $K$ by matching the inner solution,
(12.23), with the outer solution, whose behaviour as $x \to 1$ is given by (12.22).
Writing the inner expansion in terms of the outer variables and retaining terms up
to $O(\delta^2)$ leads to

\[ x = 1 - \frac{1}{8} e^{K} e^{-2t} + \frac{1}{4}\delta^2 + O(\delta^4), \]

for $t = O(1)$. Comparing this with (12.22) shows that we need $\frac{1}{8} e^{K} e^{-2t} = 2e^{-2t}$,
and hence $K = 4\log 2$, which gives

\[ x = 1 - \frac{1}{4}\delta^2\left\{\cosh(4\log 2 - 2T) - 1\right\} + O(\delta^4) \tag{12.24} \]

when $T = O(1)$. From this leading order approximation, $x = 1$ when $T = T_0 =
2\log 2 + O(\delta^2)$. This is the quarter period of the solution, so the period $\tau$ satisfies

\[ \frac{1}{4}\tau = \log\left(\frac{1}{\delta}\right) + T_0, \]

and hence

\[ \tau = 4\log\left(\frac{4}{\delta}\right) + O(\delta^2), \]

for $\delta \ll 1$. We conclude that the period of the solution does indeed tend to infinity
as $\delta \to 0$, $k \to 1^-$, but only logarithmically fast. Figure 12.7 shows a comparison
between the exact and analytical solutions. The agreement is very good for all
$\delta \le 1$. We produced this figure using the MATLAB script

    % Exact and asymptotic period of sn(t;k) for k^2 = 1 - delta^2
    Texact = []; d = 10.^(-7:0.25:0); Tasymp = 4*log(4./d);
    options = optimset('Display','off','TolX',10^-10);
    for del = d
        % ellipj expects the parameter m = k^2, not the modulus k
        m = 1-del^2; T2 = 2*log(4/del);
        % T2 estimates the half period, where sn has its first positive zero
        Texact = [Texact 2*fzero(@ellipj,T2,options,m)];
    end
    plot(log10(d),Texact,log10(d),Tasymp,'--')
    xlabel('log_{10}\delta'), ylabel('T')
    legend('exact','asymptotic')






This uses the MATLAB function fzero to find where the elliptic function is zero,
using the asymptotic expression as an initial estimate. Note that the function
optimset allows us to create a variable options that we can pass to fzero as a
parameter, which controls the details of its execution. In this case, we turn off the
output of intermediate results, and set the convergence tolerance to $10^{-10}$.
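The same comparison can be sketched without a root finder. The alternative below is not the method used in the script above: it relies on the standard facts that the period of $\mathrm{sn}(t; k)$ is $4K(k)$, where $K$ is the complete elliptic integral of the first kind, and that $K(k) = \pi / \{2\,\mathrm{AGM}(1, k')\}$ with complementary modulus $k' = \sqrt{1 - k^2}$, which is just $\delta$ here. The function names are illustrative.

```python
import math


def complete_elliptic_k(k_prime, iterations=40):
    """K(k) from the arithmetic-geometric mean, with k' = sqrt(1 - k**2)."""
    a, b = 1.0, k_prime
    for _ in range(iterations):
        a, b = 0.5 * (a + b), math.sqrt(a * b)
    return math.pi / (2.0 * a)


def period_exact(delta):
    """Period 4*K(k) of sn(t; k) when k**2 = 1 - delta**2."""
    return 4.0 * complete_elliptic_k(delta)


def period_asymptotic(delta):
    """Leading order asymptotic period, 4*log(4/delta)."""
    return 4.0 * math.log(4.0 / delta)


for exponent in range(-7, 1):
    delta = 10.0 ** exponent
    exact = period_exact(delta)
    asymp = period_asymptotic(delta)
    print(f"delta=1e{exponent:+d}  exact={exact:.8f}  "
          f"asymptotic={asymp:.8f}  difference={exact - asymp:.2e}")
```

The arithmetic-geometric mean converges quadratically, so the fixed number of iterations used here is far more than is needed; it is chosen only to keep the sketch short.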



Fig. 12.7. A comparison of the exact period of the elliptic function $\mathrm{sn}(t; k)$ for $k = \sqrt{1 - \delta^2}$.


Finally, by adding the solutions in the inner and outer regions and subtracting
the matched part, $1 + \frac{1}{4}\delta^2 - 2e^{-2t}$, we can obtain a composite expansion, uniformly
valid for $0 \le t \le \frac{1}{4}\tau = \log(4/\delta) + O(\delta^2)$, as

\[ x = \tanh t + 2e^{-2t} + \frac{1}{4}\delta^2\left\{\tanh t - t\,\mathrm{sech}^2 t - \cosh\left(\log\left(\frac{16}{\delta^2}\right) - 2t\right)\right\} + O(\delta^4). \]

Example 2: A thermal ignition problem 

Many materials decompose to produce heat. This decomposition is usually more 
rapid the higher the temperature. This leads to the possibility of thermal ignition. 
As a material gets hotter, it releases heat more rapidly, which heats it more rapidly, 
and so on. This positive feedback mechanism can lead to the unwanted, and poten- 
tially disastrous, combustion of many common materials, ranging from packets of 
powdered milk to haystacks. The most common physical mechanism that can break 
this feedback loop is the diffusion of heat through the material and out through 
its surface. The rate of heat production due to decomposition is proportional to 
the volume of the material and the rate of heat loss from its surface proportional 
to surface area. For a sufficiently large volume of material, heat production dom- 
inates heat loss, and the material ignites. Determining the critical temperature 
below which it is safe to store a potentially combustible material is an important
and difficult problem.†

We now want to develop a mathematical model of this problem. In Section 2.6.1,
we showed how to derive the diffusion equation, (2.12), which governs the flow of
heat in a body. We now need to include the effect of a chemical reaction that
produces $R(x, y, z, t)$ units of heat, per unit volume, per unit time, by adding a
term $R\,\delta t\,\delta x\,\delta y\,\delta z$ to the right hand side of (2.11). On taking the limit $\delta t$, $\delta x$, $\delta y$,
$\delta z \to 0$, we arrive at

\[ \rho c \frac{\partial T}{\partial t} = -\nabla \cdot \mathbf{Q} + R, \]

and hence, for a steady state solution ($\partial/\partial t = 0$), and since Fourier’s law of heat
conduction is $\mathbf{Q} = -k\nabla T$,

\[ k\nabla^2 T + R = 0. \tag{12.25} \]

The rate of combustion of the material can be modelled using the Arrhenius law,
$R = Ae^{-T_a/T}$, where $A$ is a constant and $T_a$ is the activation temperature, also
a constant. It is important to note that $T$ is the absolute temperature here.
The Arrhenius law can be derived from first principles using statistical mechanics,
although we will not attempt this here (see, for example, Flowers and Mendoza,
1970). Figure 12.8 shows that the reaction rate is zero at absolute zero ($T = 0$),
and remains small until $T$ approaches the activation temperature, $T_a$, when it
increases, with $R \to A$ as $T \to \infty$. After defining $u = T/T_a$ and rescaling distances
with $\sqrt{kT_a/A}$, we arrive at the nonlinear partial differential equation

\[ \nabla^2 u + e^{-1/u} = 0. \]

Fig. 12.8. The Arrhenius reaction rate law.

† For more background on combustion theory, see Buckmaster and Ludford (1982).


For a uniform sphere of combustible material with a spherically-symmetric
temperature distribution, this becomes

\[ \frac{d^2u}{dr^2} + \frac{2}{r}\frac{du}{dr} + e^{-1/u} = 0, \tag{12.26} \]

subject to

\[ \frac{du}{dr} = 0 \text{ at } r = 0, \quad u = u_a \text{ at } r = r_a. \tag{12.27} \]

dr 

Here, $u_a$ is the dimensionless absolute temperature of the surroundings and $r_a$ the
dimensionless radius of the sphere. Note that, as we discussed earlier, we would
expect that the larger $r_a$, the smaller $u_a$ must be to prevent the ignition of the
sphere.† A positive solution of this boundary value problem represents a possible
steady state solution in which these two physical processes are in balance. If no
such steady state solution exists, we can conclude that the material will ignite. A
small trick that makes the study of this problem easier is to replace (12.27) with

\[ \frac{du}{dr} = 0, \quad u = \epsilon \text{ at } r = 0. \tag{12.28} \]

We can then solve the initial value problem given by (12.26) and (12.28) for a given
value of $\epsilon$, the dimensionless temperature at the centre of the sphere, and then
determine the corresponding value of $u_a = u(r_a)$. Our task is therefore to construct
an asymptotic solution of the initial value problem given by (12.26) and (12.28)
when $\epsilon \ll 1$. Note that by using the integrating factor $r^2$, we can write (12.26) as

\[ \frac{du}{dr} = -\frac{1}{r^2}\int_0^r s^2 e^{-1/u(s)}\,ds < 0, \]

and hence conclude that $u$ is a monotone decreasing function of $r$. The temperature
of the sphere decreases from centre to surface.

Asymptotic analysis: Region I 

Since $u = \epsilon$ at $r = 0$, we define a new variable $\hat{u} = u/\epsilon$, with $\hat{u} = O(1)$ for $\epsilon \ll 1$.
In terms of this variable, (12.26) and (12.28) become

\[ \frac{d^2\hat{u}}{dr^2} + \frac{2}{r}\frac{d\hat{u}}{dr} + \frac{1}{\epsilon}\exp\left(-\frac{1}{\epsilon\hat{u}}\right) = 0, \]

subject to

\[ \frac{d\hat{u}}{dr} = 0, \quad \hat{u} = 1 \text{ at } r = 0. \]

† For all the technical details of this problem, which was first studied by Gel’fand (1963), see
Billingham (2000).

At leading order,

\[ \frac{d^2\hat{u}}{dr^2} + \frac{2}{r}\frac{d\hat{u}}{dr} = 0, \tag{12.29} \]

which has the trivial solution $\hat{u} = 1$. We’re going to need more than this to be able
to proceed, so let’s look for an asymptotic expansion of the form

\[ \hat{u} = 1 + \phi_1(\epsilon)\hat{u}_1 + \phi_2(\epsilon)\hat{u}_2 + \cdots, \]

where $\phi_2 \ll \phi_1 \ll 1$ are to be determined. As we shall see, we need a three-term
asymptotic expansion to be able to determine the scalings for the next asymptotic
region.

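Firstly, note that

\[ \frac{1}{\epsilon}\exp\left\{-\frac{1}{\epsilon}(1 + \phi_1\hat{u}_1)^{-1}\right\}
   \sim \frac{1}{\epsilon}\exp\left\{-\frac{1}{\epsilon}(1 - \phi_1\hat{u}_1)\right\}
   \sim \frac{1}{\epsilon} e^{-1/\epsilon}\exp\left(\frac{\phi_1}{\epsilon}\hat{u}_1\right)
   \sim \frac{1}{\epsilon} e^{-1/\epsilon}\left(1 + \frac{\phi_1}{\epsilon}\hat{u}_1\right), \]

provided that $\phi_1 \ll \epsilon$, which we can check later. Equation (12.26) then shows that

\[ \phi_1\left(\frac{d^2\hat{u}_1}{dr^2} + \frac{2}{r}\frac{d\hat{u}_1}{dr}\right)
   + \phi_2\left(\frac{d^2\hat{u}_2}{dr^2} + \frac{2}{r}\frac{d\hat{u}_2}{dr}\right)
   \sim -\frac{1}{\epsilon} e^{-1/\epsilon}\left(1 + \frac{\phi_1}{\epsilon}\hat{u}_1\right). \]

In order to obtain a balance of terms, we therefore take

\[ \phi_1 = \frac{1}{\epsilon} e^{-1/\epsilon} \ll \epsilon, \quad \phi_2 = \frac{1}{\epsilon^2} e^{-1/\epsilon}\phi_1 = \frac{1}{\epsilon^3} e^{-2/\epsilon}, \]

and hence expand

\[ \hat{u} = 1 + \frac{1}{\epsilon} e^{-1/\epsilon}\hat{u}_1 + \frac{1}{\epsilon^3} e^{-2/\epsilon}\hat{u}_2 + \cdots. \tag{12.30} \]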

Now, using (12.30),

\[ \frac{d^2\hat{u}_1}{dr^2} + \frac{2}{r}\frac{d\hat{u}_1}{dr} = -1, \tag{12.31} \]

subject to

\[ \frac{d\hat{u}_1}{dr} = \hat{u}_1 = 0 \text{ at } r = 0. \]

Using the integrating factor $r^2$, we find that the solution is $\hat{u}_1 = -\frac{1}{6} r^2$. Similarly,

\[ \frac{d^2\hat{u}_2}{dr^2} + \frac{2}{r}\frac{d\hat{u}_2}{dr} = -\hat{u}_1 = \frac{1}{6} r^2, \]

subject to

\[ \frac{d\hat{u}_2}{dr} = \hat{u}_2 = 0 \text{ at } r = 0, \]

and hence $\hat{u}_2 = \frac{1}{120} r^4$. This means that

\[ \hat{u} \sim 1 - \frac{1}{6\epsilon} e^{-1/\epsilon} r^2 + \frac{1}{120\epsilon^3} e^{-2/\epsilon} r^4 \tag{12.32} \]

for $\epsilon \ll 1$. This expansion is not uniformly valid, since for sufficiently large $r$ the
second term is comparable to the first, when $r = O(\epsilon^{1/2} e^{1/2\epsilon})$, and the third is
comparable to the second, when $r = O(\epsilon e^{1/2\epsilon}) \ll \epsilon^{1/2} e^{1/2\epsilon}$. As $r$ increases, the first
nonuniformity to occur is therefore when $r = O(\epsilon e^{1/2\epsilon})$ and $u = \epsilon + O(\epsilon^2)$. Note
that this is why we needed a three-term expansion to work out the scalings for the
next asymptotic region.

What would have happened if we had taken only a two-term expansion and
looked for a new asymptotic region with $r = O(\epsilon^{1/2} e^{1/2\epsilon})$ and $u = O(\epsilon)$? If you try
this, you will find that it is impossible to match the solutions in the new region to the
solutions in region I. After some thought, you notice that the equations at $O(1)$ and
$O(\frac{1}{\epsilon} e^{-1/\epsilon})$ in region I, (12.29) and (12.31), do not depend at all on the functional
form of the term $e^{-1/u}$, which could be replaced by $e^{-1/\epsilon}$ without affecting (12.29)
or (12.31). This is a sign that we need another term in the expansion in order to
capture the effect of the only nonlinearity in the problem.


Asymptotic analysis: Region II 

In this region we define scaled variables $u = \epsilon + \epsilon^2 U$, $r = \epsilon e^{1/2\epsilon} R$, with $U = O(1)$
and $R = O(1)$ for $\epsilon \ll 1$. At leading order, (12.26) becomes

\[ \frac{d^2U}{dR^2} + \frac{2}{R}\frac{dU}{dR} + e^{U} = 0, \tag{12.33} \]

to be solved subject to the matching condition

\[ U \sim -\frac{1}{6} R^2 \quad \text{as } R \to 0. \tag{12.34} \]

Equation (12.33) is nonlinear and nonautonomous, which usually means that we
must resort to finding a numerical solution. However, we have seen in Chapter 10
that we can often make progress using group theoretical methods. Equation (12.33)
is invariant under the transformation $U \mapsto U + k$, $R \mapsto e^{-k/2} R$. We can therefore
make the nonlinear transformation

\[ p(s) = e^{-2s} e^{U}, \quad q(s) = 2 + e^{-s}\frac{dU}{dR}, \quad R = e^{-s}, \]
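after which (12.33) and (12.34) become

\[ \frac{dp}{ds} = -pq, \quad \frac{dq}{ds} = p + q - 2, \tag{12.35} \]

subject to

\[ p \sim e^{-2s}, \quad q \sim 2 - \frac{1}{3} e^{-2s} \quad \text{as } s \to \infty. \tag{12.36} \]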


The problem is still nonlinear, but is now autonomous, so we can use the phase 
plane methods that we studied in Chapter 9. 

There are two finite equilibrium points, at $P_1 = (0, 2)$ and $P_2 = (2, 0)$ in the
$(p, q)$ phase plane. After determining the Jacobian at each of these points and
calculating the eigenvalues in the usual way, we find that $P_1$ is a saddle point and
$P_2$ is an unstable, anticlockwise spiral. Since (12.36) shows that we are interested
in a solution that asymptotes to $P_1$ as $s \to \infty$, this solution must be represented by
one of the stable separatrices of $P_1$. Furthermore, since $p = e^{-2s} e^{U} > 0$, the unique
integral path that represents the solution is the stable separatrix of $P_1$ that lies in
the half plane $p > 0$. What happens to this separatrix as $s \to -\infty$? A sensible, and
correct, guess would be that it asymptotes to the unstable spiral at $P_2$, as shown in
Figure 12.9, for which we calculated the solution numerically using MATLAB (see
Section 9.3.4). The proof that this is the case is Exercise 9.17, which comes with
some hints.



Fig. 12.9. The behaviour of the solution of (12.35) subject to (12.36) in the (p, q)-phase 
plane. 
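
The phase plane picture can be checked with a few lines of code. The following Python sketch is ours, not the MATLAB calculation behind Figure 12.9. It integrates (12.35) backwards in $s$ from a point close to $P_1$ taken from (12.36), using a hand-written fourth order Runge-Kutta step. The function names, the starting value of $s$ and the step length are illustrative assumptions.

```python
import math

def rhs(p, q):
    """Right-hand side of (12.35): dp/ds = -p*q, dq/ds = p + q - 2."""
    return -p * q, p + q - 2.0

def rk4_step(p, q, h):
    """One classical fourth order Runge-Kutta step of length h (h may be negative)."""
    k1p, k1q = rhs(p, q)
    k2p, k2q = rhs(p + 0.5 * h * k1p, q + 0.5 * h * k1q)
    k3p, k3q = rhs(p + 0.5 * h * k2p, q + 0.5 * h * k2q)
    k4p, k4q = rhs(p + h * k3p, q + h * k3q)
    return (p + h * (k1p + 2 * k2p + 2 * k3p + k4p) / 6.0,
            q + h * (k1q + 2 * k2q + 2 * k3q + k4q) / 6.0)

def trace_separatrix(s_start=5.0, s_end=-20.0, h=-0.002):
    """Follow the stable separatrix of P1 = (0, 2) backwards in s.

    The starting point uses the matching behaviour (12.36):
    p ~ exp(-2s), q ~ 2 - exp(-2s)/3 for large s.
    Returns the final (p, q) and the largest value of p met on the way.
    """
    delta = math.exp(-2.0 * s_start)
    p, q = delta, 2.0 - delta / 3.0
    p_max = p
    n_steps = int(round((s_end - s_start) / h))
    for _ in range(n_steps):
        p, q = rk4_step(p, q, h)
        p_max = max(p_max, p)
    return p, q, p_max

if __name__ == "__main__":
    p_end, q_end, p_max = trace_separatrix()
    print("end point:", p_end, q_end)
    print("largest p on the separatrix:", p_max)
```

Integrating backwards in $s$ is convenient here because small errors off the separatrix then decay rather than grow near $P_1$. The end point should lie close to $P_2 = (2, 0)$, and the largest value of $p$ can be compared with the value $p_{\max} \approx 3.322$ quoted in the text.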


Since the solution asymptotes to $P_2$, we can determine its behaviour as $s \to -\infty$
by considering the local solution there. The eigenvalues of $P_2$ are $\frac{1}{2}(1 \pm i\sqrt{7})$, and
therefore

$$p \sim 2 + A e^{s/2}\sin\left(\frac{\sqrt{7}}{2}s + B\right) \quad \text{as } s \to -\infty,$$

for some constants $A$ and $B$. Since $U = 2s + \log p$ and $s = -\log R$, this shows that

$$U \sim -2\log R + \log 2 - \frac{A}{2\sqrt{R}}\sin\left(\frac{\sqrt{7}}{2}\log R - B\right) \quad \text{as } R \to \infty. \tag{12.37}$$

We conclude that

$$u \sim \epsilon + \epsilon^2(\log 2 - 2\log R) \quad \text{as } R \to \infty,$$

for $\epsilon \ll 1$. When $\log R = O(\epsilon^{-1})$, the second term is comparable to the first term,
and we have a further nonuniformity.

Asymptotic analysis: Region III

We define scaled variables $u = \epsilon\hat{U}$ and $S = \epsilon\log R = -\frac{1}{2} - \epsilon\log\epsilon + \epsilon\log r$, so that
$r = \epsilon e^{1/2\epsilon} e^{S/\epsilon}$, with $\hat{U} = O(1)$ and $S = O(1)$ for $\epsilon \ll 1$. In terms of these variables,
(12.26) becomes

$$\epsilon\frac{d^2\hat{U}}{dS^2} + \frac{d\hat{U}}{dS} + \exp\left\{\frac{1}{\epsilon}\left(-\frac{1}{\hat{U}} + 1 + 2S\right)\right\} = 0. \tag{12.38}$$

We expand $\hat{U}$ as

$$\hat{U} = \hat{U}_0 + \epsilon\hat{U}_1 + \cdots.$$

To find the matching conditions, we write expansion (12.37) in region II in terms of
the new variables and retain terms up to $O(\epsilon)$, which gives

$$\hat{U}_0 \sim 1 - 2S, \quad \hat{U}_1 \sim \log 2 \quad \text{as } S \to 0. \tag{12.39}$$

At leading order, the solution is given by the exponential term in (12.38) as

$$\hat{U}_0 = \frac{1}{1 + 2S},$$

which satisfies (12.39). At $O(\epsilon)$, we find that

$$\frac{d\hat{U}_0}{dS} + \exp\left\{(1 + 2S)^2\hat{U}_1\right\} = 0,$$

and hence that

$$\hat{U}_1 = \frac{1}{(1 + 2S)^2}\log\left\{\frac{2}{(1 + 2S)^2}\right\},$$

which also satisfies (12.39). We conclude that

$$u = \frac{\epsilon}{1 + 2S} + \frac{\epsilon^2}{(1 + 2S)^2}\log\left\{\frac{2}{(1 + 2S)^2}\right\} + O(\epsilon^3),$$

for $S = O(1)$ and $\epsilon \ll 1$. This expansion remains uniform, with $u \to 0$ as $S \to \infty$,
and hence $R \to \infty$, and the solution is complete.


We can now determine $u_a = u(r_a)$. If $r_a \ll \epsilon e^{1/2\epsilon}$, $r = r_a$ lies in region I, so that
$u_a \sim \epsilon - \frac{1}{6}e^{-1/\epsilon} r_a^2 \sim \epsilon$. In other words, for $u_a$ sufficiently small that $r_a \ll u_a e^{1/2u_a}$,
a steady state solution is always available, and we predict that the sphere will not
ignite.

If $r_a = O(\epsilon e^{1/2\epsilon})$, we need to consider the solution in region II. The oscillations
in $p$ lead to oscillations in $u_a$ as a function of $\epsilon$, as shown in Figure 12.10. In
particular, the fact that $p < p_{\max} \approx 3.322$, as can be seen in Figure 12.9, leads to
a maximum value of $u_a$ for which a steady state solution is available, namely

Marnax ~ £ + £ log I ^ J . (12.40) 


Finally, if r a = 0(ee 3 ^ 2lE ), the solution in region III shows that 

£ 


\ +2filog(r a /e) 


< «a 
We conclude that $u_{a\max}$ (the critical ignition temperature or critical storage
temperature) gives an asymptotic approximation to the hottest ambient temperature
at which a steady state solution exists, and hence at which the material will
not ignite.



Fig. 12.10. The ambient temperature, u a , as a function of e in region II when r a = 10 fl . 


12.2.4 The Method of Multiple Scales 

Let’s now return to solving a simple, linear, constant coefficient ordinary differential
equation that, at first sight, seems like a regular perturbation problem.
Consider

$$\ddot{y} + 2\epsilon\dot{y} + y = 0, \tag{12.41}$$

to be solved subject to

$$y(0) = 1, \quad \dot{y}(0) = 0, \tag{12.42}$$

for $t \geq 0$, where a dot denotes $d/dt$. Since this is an initial value problem, we can
think of $y$ developing as a function of time, $t$. As usual, we expand

$$y = y_0(t) + \epsilon y_1(t) + O(\epsilon^2)$$

for $\epsilon \ll 1$. At leading order, $\ddot{y}_0 + y_0 = 0$, subject to $y_0(0) = 1$, $\dot{y}_0(0) = 0$, which has
solution $y_0 = \cos t$. At $O(\epsilon)$,

$$\ddot{y}_1 + y_1 = -2\dot{y}_0 = 2\sin t,$$

subject to $y_1(0) = 0$, $\dot{y}_1(0) = 0$. After noting that the particular integral solution
of this equation takes the form $y_{1p} = kt\cos t$ for some constant $k$, we find that
$y_1 = -t\cos t + \sin t$. This means that

$$y(t) \sim \cos t + \epsilon(-t\cos t + \sin t) \tag{12.43}$$

as $\epsilon \to 0$. As $t \to \infty$, the ratio of the second term to the first term in this expansion
is asymptotic to $\epsilon t$, and is therefore no longer small when $t = O(\epsilon^{-1})$. We conclude
that the asymptotic solution given by (12.43) is only valid for $t \ll \epsilon^{-1}$.

To see where the problem lies, let’s examine the exact solution of (12.41) subject
to the boundary conditions (12.42), which is

$$y = e^{-\epsilon t}\left\{\cos\left(\sqrt{1 - \epsilon^2}\,t\right) + \frac{\epsilon}{\sqrt{1 - \epsilon^2}}\sin\left(\sqrt{1 - \epsilon^2}\,t\right)\right\}. \tag{12.44}$$

As we can see from Figure 12.11, the solution is a decaying oscillation, with the decay
occurring over a timescale of $O(\epsilon^{-1})$. At leading order, (12.41) is an undamped,
linear oscillator. The term $2\epsilon\dot{y}$ represents the effect of weak damping, which slowly
reduces the amplitude of the oscillation. The problem with the asymptotic expansion
(12.43) is that, although it correctly captures the fact that $e^{-\epsilon t} \sim 1 - \epsilon t$ for
$\epsilon \ll 1$ and $t \ll \epsilon^{-1}$, we need to keep the exponential rather than its Taylor series
expansion if we are to construct a solution that is valid when $t = O(\epsilon^{-1})$. Figure 12.11
shows that the two-term asymptotic expansion, (12.43), rapidly becomes
inaccurate once $t = O(\epsilon^{-1})$.



Fig. 12.11. The exact solution, (12.44), two-term asymptotic solution, (12.43), and one-term
multiple scales solution (12.50) of (12.41) when $\epsilon = 0.1$.


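The method of multiple scales, in its most basic form, consists of defining a
new slow time variable, $T = \epsilon t$, so that when $t = O(1/\epsilon)$, $T = O(1)$, and the
slow decay can therefore be accounted for. We then look for an asymptotic solution

$$y = y_0(t, T) + \epsilon y_1(t, T) + O(\epsilon^2),$$

with each term a function of both $t$, to capture the oscillation, and $T$, to capture
the slow decay. After noting that

$$\frac{d}{dt} = \frac{\partial}{\partial t} + \epsilon\frac{\partial}{\partial T},$$

(12.41) becomes

$$\frac{\partial^2 y}{\partial t^2} + 2\epsilon\frac{\partial^2 y}{\partial t\,\partial T} + \epsilon^2\frac{\partial^2 y}{\partial T^2} + 2\epsilon\frac{\partial y}{\partial t} + 2\epsilon^2\frac{\partial y}{\partial T} + y = 0, \tag{12.45}$$

to be solved subject to

$$y(0, 0) = 1, \quad \frac{\partial y}{\partial t}(0, 0) + \epsilon\frac{\partial y}{\partial T}(0, 0) = 0. \tag{12.46}$$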


At leading order,

$$\frac{\partial^2 y_0}{\partial t^2} + y_0 = 0,$$

subject to

$$y_0(0, 0) = 1, \quad \frac{\partial y_0}{\partial t}(0, 0) = 0.$$

Although this is a partial differential equation, the only derivatives are with respect
to $t$, so we can solve as if it were an ordinary differential equation in $t$. However,
we must remember that the ‘constants’ of integration can actually be functions of
the other variable, $T$. This means that

$$y_0 = A_0(T)\cos t + B_0(T)\sin t,$$

and the boundary conditions show that

$$A_0(0) = 1, \quad B_0(0) = 0. \tag{12.47}$$

The functions $A_0(T)$ and $B_0(T)$ are, as yet, undetermined.
At $O(\epsilon)$,

$$\frac{\partial^2 y_1}{\partial t^2} + y_1 = -2\frac{\partial^2 y_0}{\partial t\,\partial T} - 2\frac{\partial y_0}{\partial t} = 2\left(\frac{dA_0}{dT} + A_0\right)\sin t - 2\left(\frac{dB_0}{dT} + B_0\right)\cos t. \tag{12.48}$$


Because of the appearance on the right hand side of the equation of the terms
$\cos t$ and $\sin t$, which are themselves solutions of the homogeneous version of the
equation, the particular integral solution will involve the terms $t\sin t$ and $t\cos t$. As
we have seen, it is precisely terms of this type that lead to a nonuniformity in the
asymptotic expansion when $t = O(\epsilon^{-1})$. The terms proportional to $\sin t$ and $\cos t$
in (12.48) are known as secular terms and, to keep the asymptotic expansion
uniform, we must eliminate them by choosing $A_0$ and $B_0$ appropriately. In this
example, we need

$$\frac{dA_0}{dT} + A_0 = 0, \quad \frac{dB_0}{dT} + B_0 = 0, \tag{12.49}$$

and hence

$$y_1 = A_1(T)\cos t + B_1(T)\sin t.$$

The initial conditions for (12.49) are given by (12.47), so the solutions are $A_0 = e^{-T}$
and $B_0 = 0$. We conclude that

$$y \sim e^{-T}\cos t = e^{-\epsilon t}\cos t \tag{12.50}$$

for $\epsilon \ll 1$. This asymptotic solution is consistent with the exact solution, (12.44),
and remains valid when $t = O(\epsilon^{-1})$, as can be seen in Figure 12.11. In fact, we will
see below that this solution is only valid for $t \ll \epsilon^{-2}$.
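
The comparison described above is easy to reproduce. The following Python sketch is ours and only illustrates the idea behind Figure 12.11. It evaluates the exact solution of (12.41) and (12.42), written in the standard closed form for a lightly damped linear oscillator, together with the two-term expansion (12.43) and the one-term multiple scales solution (12.50). The function names, the time interval and the sampling are assumptions made for illustration.

```python
import math

def exact(t, eps):
    """Exact solution of y'' + 2*eps*y' + y = 0, y(0) = 1, y'(0) = 0, for 0 < eps < 1."""
    w = math.sqrt(1.0 - eps * eps)
    return math.exp(-eps * t) * (math.cos(w * t) + (eps / w) * math.sin(w * t))

def regular_two_term(t, eps):
    """Two-term regular expansion (12.43)."""
    return math.cos(t) + eps * (-t * math.cos(t) + math.sin(t))

def multiple_scales(t, eps):
    """One-term multiple scales solution (12.50)."""
    return math.exp(-eps * t) * math.cos(t)

def max_errors(eps=0.1, t_max=30.0, n=3000):
    """Largest absolute error of each approximation on a uniform grid in [0, t_max]."""
    ts = [t_max * i / n for i in range(n + 1)]
    err_regular = max(abs(regular_two_term(t, eps) - exact(t, eps)) for t in ts)
    err_scales = max(abs(multiple_scales(t, eps) - exact(t, eps)) for t in ts)
    return err_regular, err_scales

if __name__ == "__main__":
    err_regular, err_scales = max_errors()
    print("two-term regular expansion:", err_regular)
    print("one-term multiple scales:  ", err_scales)
```

With $\epsilon = 0.1$ and $0 \leq t \leq 30$, the error in the regular expansion grows to be larger than the solution itself, whilst the error in the multiple scales solution stays of $O(\epsilon)$.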

In order to proceed to find more terms in the asymptotic expansion using the
method of multiple scales, we can take the exact solution, (12.44), as a guide. We
know that

$$y \sim e^{-\epsilon t}\left\{\cos\left(1 - \frac{1}{2}\epsilon^2\right)t + \epsilon\sin\left(1 - \frac{1}{2}\epsilon^2\right)t\right\} \tag{12.51}$$

for $\epsilon \ll 1$. This shows that the phase of the oscillation changes by an $O(1)$ amount
when $t = O(\epsilon^{-2})$. In order to capture this, we seek a solution that is a function of
the two timescales

$$T = \epsilon t, \quad \tau = (1 + a\epsilon^2 + b\epsilon^3 + \cdots)t,$$

with the constants $a, b, \ldots$ to be determined. Although this looks like a bit of a
cheat, since we are only doing this because we know the exact solution, this approach
works for a wide range of problems, including nonlinear differential equations.

In order to develop a one-term multiple scale expansion, we needed to consider
the solution up to $O(\epsilon)$. This suggests that we will need to expand up to $O(\epsilon^2)$ to
construct a two-term multiple scales expansion, with

$$y = y_0(\tau, T) + \epsilon y_1(\tau, T) + \epsilon^2 y_2(\tau, T) + O(\epsilon^3).$$


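After noting that

$$\frac{d}{dt} = \epsilon\frac{\partial}{\partial T} + (1 + a\epsilon^2 + b\epsilon^3 + \cdots)\frac{\partial}{\partial\tau},$$

equation (12.41) becomes

$$\frac{\partial^2 y}{\partial\tau^2} + 2a\epsilon^2\frac{\partial^2 y}{\partial\tau^2} + 2\epsilon\frac{\partial^2 y}{\partial\tau\,\partial T} + \epsilon^2\frac{\partial^2 y}{\partial T^2} + 2\epsilon\frac{\partial y}{\partial\tau} + 2\epsilon^2\frac{\partial y}{\partial T} + y + O(\epsilon^3) = 0, \tag{12.52}$$

to be solved subject to

$$y(0, 0) = 1, \quad (1 + a\epsilon^2 + b\epsilon^3 + \cdots)\frac{\partial y}{\partial\tau}(0, 0) + \epsilon\frac{\partial y}{\partial T}(0, 0) = 0. \tag{12.53}$$

We already know that

$$y_0 = e^{-T}\cos\tau, \quad y_1 = A_1(T)\cos\tau + B_1(T)\sin\tau.$$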
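At $O(\epsilon^2)$,

$$\begin{aligned}
\frac{\partial^2 y_2}{\partial\tau^2} + y_2 &= -2a\frac{\partial^2 y_0}{\partial\tau^2} - 2\frac{\partial^2 y_1}{\partial\tau\,\partial T} - \frac{\partial^2 y_0}{\partial T^2} - 2\frac{\partial y_1}{\partial\tau} - 2\frac{\partial y_0}{\partial T} \\
&= 2\left(\frac{dA_1}{dT} + A_1\right)\sin\tau - 2\left\{\frac{dB_1}{dT} + B_1 - \left(a + \frac{1}{2}\right)e^{-T}\right\}\cos\tau.
\end{aligned}$$

In order to remove the secular terms we need

$$\frac{dA_1}{dT} + A_1 = 0, \quad \frac{dB_1}{dT} + B_1 = \left(a + \frac{1}{2}\right)e^{-T}.$$

At $O(\epsilon)$ the boundary conditions are

$$y_1(0, 0) = A_1(0) = 0, \quad \frac{\partial y_1}{\partial\tau}(0, 0) + \frac{\partial y_0}{\partial T}(0, 0) = B_1(0) - 1 = 0,$$

and hence

$$A_1 = 0, \quad B_1 = \left(a + \frac{1}{2}\right)Te^{-T} + e^{-T}.$$

This gives us

$$y \sim e^{-T}\cos\tau + \epsilon\left\{\left(a + \frac{1}{2}\right)Te^{-T} + e^{-T}\right\}\sin\tau.$$

However, the part of the $O(\epsilon)$ term proportional to $T$ will lead to a nonuniformity
in the expansion when $T = O(\epsilon^{-1})$, and we must therefore remove it by choosing
$a = -1/2$. We could have deduced this directly from the differential equation for
$B_1$, since the term proportional to $e^{-T}$ is secular. We conclude that

$$\tau = \left(1 - \frac{1}{2}\epsilon^2 + \cdots\right)t,$$

and hence obtain (12.51), as expected.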
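
To see what the second timescale buys us, we can compare the one-term solution (12.50) and the two-term solution (12.51) with the exact solution for times up to $t = \epsilon^{-2}$. The following Python sketch is ours. Since all three functions decay like $e^{-\epsilon t}$, the errors are multiplied by $e^{\epsilon t}$ so that the slow drift in phase remains visible at late times. This scaling, the function names and the sampling are assumptions made for illustration.

```python
import math

def exact(t, eps):
    """Exact solution of y'' + 2*eps*y' + y = 0, y(0) = 1, y'(0) = 0, for 0 < eps < 1."""
    w = math.sqrt(1.0 - eps * eps)
    return math.exp(-eps * t) * (math.cos(w * t) + (eps / w) * math.sin(w * t))

def one_term(t, eps):
    """One-term multiple scales solution (12.50)."""
    return math.exp(-eps * t) * math.cos(t)

def two_term(t, eps):
    """Two-term multiple scales solution (12.51), with tau = (1 - eps^2/2) t."""
    tau = (1.0 - 0.5 * eps * eps) * t
    return math.exp(-eps * t) * (math.cos(tau) + eps * math.sin(tau))

def scaled_errors(eps=0.1, n=20000):
    """Largest value of exp(eps*t)*|approximation - exact| for 0 <= t <= 1/eps^2."""
    t_max = 1.0 / eps ** 2
    worst_one = worst_two = 0.0
    for i in range(n + 1):
        t = t_max * i / n
        scale = math.exp(eps * t)  # undo the decay so that phase errors are visible
        y = exact(t, eps)
        worst_one = max(worst_one, scale * abs(one_term(t, eps) - y))
        worst_two = max(worst_two, scale * abs(two_term(t, eps) - y))
    return worst_one, worst_two

if __name__ == "__main__":
    print(scaled_errors())
```

For $\epsilon = 0.1$ the scaled error in the one-term solution becomes of $O(1)$ by $t = \epsilon^{-2}$, because the phase has drifted by about $\frac{1}{2}\epsilon^2 t$, whilst the scaled error in the two-term solution remains small.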


Example 1: The van der Pol Oscillator 
The governing equation for the van der Pol oscillator is 

$$\frac{d^2 y}{dt^2} + \epsilon(y^2 - 1)\frac{dy}{dt} + y = 0, \tag{12.54}$$

for $t \geq 0$. For $\epsilon \ll 1$ this is a linear oscillator with a weak nonlinear term, $\epsilon(y^2 - 1)\dot{y}$.
For $|y| < 1$ this term tends to drive the oscillations to a greater amplitude, whilst
for $|y| > 1$, this term damps the oscillations. It is straightforward (at least if you’re
an electrical engineer!) to build an electronic circuit from resistors and capacitors 
whose output is governed by (12.54). It was in this context that this system was first 
studied extensively as a prototypical nonlinear oscillator. It is also straightforward 
to construct a forced van der Pol oscillator, which leads to a nonzero right hand 
side in (12.54), and study the chaotic response of the circuit. 

Since the damping is zero when $y = 1$, a reasonable guess at the behaviour of
the solution for $\epsilon \ll 1$ would be that there is an oscillatory solution with unit
amplitude. Let’s investigate this plausible, but incorrect, guess by considering the
solution with initial conditions

$$y(0) = 1, \quad \frac{dy}{dt}(0) = 0. \tag{12.55}$$

We will use the method of multiple scales, and define a slow time scale, $T = \epsilon t$. We
seek an asymptotic solution of the form

$$y = y_0(t, T) + \epsilon y_1(t, T) + O(\epsilon^2).$$

As for our linear example, we find that

$$y_0 = A_0(T)\cos\{t + \phi_0(T)\}.$$

For this problem, it is more convenient to write the solution in terms of an amplitude,
$A_0(T)$, and phase, $\phi_0(T)$. The boundary conditions show that

$$A_0(0) = 1, \quad \phi_0(0) = 0. \tag{12.56}$$


At $O(\epsilon)$,

$$\begin{aligned}
\frac{\partial^2 y_1}{\partial t^2} + y_1 &= -2\frac{\partial^2 y_0}{\partial t\,\partial T} - (y_0^2 - 1)\frac{\partial y_0}{\partial t} \\
&= 2\frac{dA_0}{dT}\sin(t + \phi_0) + 2A_0\frac{d\phi_0}{dT}\cos(t + \phi_0) + A_0\sin(t + \phi_0)\left\{A_0^2\cos^2(t + \phi_0) - 1\right\}.
\end{aligned}$$

In order to pick out the secular terms on the right hand side, we note that†

$$\sin\theta\cos^2\theta = \sin\theta - \sin^3\theta = \sin\theta - \frac{3}{4}\sin\theta + \frac{1}{4}\sin 3\theta = \frac{1}{4}\sin\theta + \frac{1}{4}\sin 3\theta.$$

This means that

$$\frac{\partial^2 y_1}{\partial t^2} + y_1 = \left(2\frac{dA_0}{dT} + \frac{1}{4}A_0^3 - A_0\right)\sin(t + \phi_0) + 2A_0\frac{d\phi_0}{dT}\cos(t + \phi_0) + \frac{1}{4}A_0^3\sin 3(t + \phi_0).$$

To suppress the secular terms we therefore need

$$\frac{d\phi_0}{dT} = 0, \quad \frac{dA_0}{dT} = \frac{1}{8}A_0(4 - A_0^2).$$

Subject to the boundary conditions (12.56), the solutions are

$$\phi_0 = 0, \quad A_0 = 2(1 + 3e^{-T})^{-1/2}.$$

Therefore

$$y \sim 2(1 + 3e^{-T})^{-1/2}\cos t$$

for $\epsilon \ll 1$, and we conclude that the amplitude of the oscillation actually tends to
2 as $t \to \infty$, as shown in Figure 12.12.

† To get $\cos^n\theta$ in terms of $\cos m\theta$ for $m = 1, 2, \ldots, n$, use $e^{in\theta} = \cos n\theta + i\sin n\theta = (e^{i\theta})^n =
(\cos\theta + i\sin\theta)^n$ and equate real and imaginary parts.

Fig. 12.12. The leading order solution of the van der Pol equation, (12.54), subject to
$y(0) = 1$, $\dot{y}(0) = 0$, when $\epsilon = 0.1$.
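
The growth of the amplitude from 1 towards 2 can be confirmed numerically. The following Python sketch is ours. It integrates (12.54) subject to (12.55) with a hand-written fourth order Runge-Kutta scheme and measures the amplitude over the final period, for comparison with the leading order envelope $A_0(T) = 2(1 + 3e^{-T})^{-1/2}$. The function names, step length and final time are assumptions made for illustration.

```python
import math

def vdp_rhs(y, v, eps):
    """First order form of (12.54): y' = v, v' = -y - eps*(y^2 - 1)*v."""
    return v, -y - eps * (y * y - 1.0) * v

def integrate_vdp(eps=0.1, t_end=100.0, h=0.01, y0=1.0, v0=0.0):
    """Classical RK4 integration from t = 0; returns lists of times and y values."""
    n_steps = int(round(t_end / h))
    y, v = y0, v0
    ts, ys = [0.0], [y]
    for i in range(n_steps):
        k1y, k1v = vdp_rhs(y, v, eps)
        k2y, k2v = vdp_rhs(y + 0.5 * h * k1y, v + 0.5 * h * k1v, eps)
        k3y, k3v = vdp_rhs(y + 0.5 * h * k2y, v + 0.5 * h * k2v, eps)
        k4y, k4v = vdp_rhs(y + h * k3y, v + h * k3v, eps)
        y += h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6.0
        v += h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6.0
        ts.append((i + 1) * h)
        ys.append(y)
    return ts, ys

def envelope(t, eps):
    """Leading order amplitude A_0(T) = 2/sqrt(1 + 3*exp(-T)) with T = eps*t."""
    return 2.0 / math.sqrt(1.0 + 3.0 * math.exp(-eps * t))

def late_amplitude(ts, ys, window=2.0 * math.pi):
    """Largest |y| over the last `window` units of time."""
    t_end = ts[-1]
    return max(abs(y) for t, y in zip(ts, ys) if t >= t_end - window)

if __name__ == "__main__":
    ts, ys = integrate_vdp()
    print("numerical amplitude near t = 100:", late_amplitude(ts, ys))
    print("envelope A_0 at t = 100:         ", envelope(100.0, 0.1))
```

The two numbers printed should both be close to 2, in line with the conclusion that the guess of unit amplitude is incorrect.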


Example 2: Jacobian elliptic functions with almost simple harmonic behaviour 

Let’s again turn our attention to the Jacobian elliptic function $x = \mathrm{sn}(t; k)$, which
satisfies (12.21). When $k \ll 1$, this function is oscillatory and, at leading order,
performs simple harmonic motion. We can see this more clearly by differentiating
(12.21) and eliminating $dx/dt$ to obtain

$$\frac{d^2 x}{dt^2} + (1 + k^2)x = 2k^2 x^3. \tag{12.57}$$

The initial conditions, of which there must be two for this second order equation,
are $x = 0$ and, from (12.21), $dx/dt = 1$ when $t = 0$. Let’s now use the method
of multiple scales to see how this small perturbation affects the oscillation after a
long time. As usual, we define $T = k^2 t$ and $x \equiv x(t, T)$, in terms of which (12.57)
becomes

$$\frac{\partial^2 x}{\partial t^2} + 2k^2\frac{\partial^2 x}{\partial t\,\partial T} + k^4\frac{\partial^2 x}{\partial T^2} + (1 + k^2)x = 2k^2 x^3. \tag{12.58}$$

We seek an asymptotic solution

$$x = x_0(t, T) + k^2 x_1(t, T) + O(k^4).$$


At leading order

$$\frac{\partial^2 x_0}{\partial t^2} + x_0 = 0,$$

subject to $x_0(0, 0) = 0$ and $\frac{\partial x_0}{\partial t}(0, 0) = 1$. This has solution

$$x_0 = A(T)\sin\{t + \phi(T)\},$$

and

$$A(0) = 1, \quad \phi(0) = 0. \tag{12.59}$$

At $O(k^2)$,

$$\begin{aligned}
\frac{\partial^2 x_1}{\partial t^2} + x_1 &= 2x_0^3 - x_0 - 2\frac{\partial^2 x_0}{\partial t\,\partial T} \\
&= 2A^3\sin^3(t + \phi) - A\sin(t + \phi) - 2\frac{dA}{dT}\cos(t + \phi) + 2A\frac{d\phi}{dT}\sin(t + \phi) \\
&= \left(\frac{3}{2}A^3 - A + 2A\frac{d\phi}{dT}\right)\sin(t + \phi) - 2\frac{dA}{dT}\cos(t + \phi) - \frac{1}{2}A^3\sin 3(t + \phi).
\end{aligned}$$

In order to remove the secular terms, we must set the coefficients of $\sin(t + \phi)$ and
$\cos(t + \phi)$ to zero. This gives us two simple ordinary differential equations to be
solved subject to (12.59), which gives

$$A = 1, \quad \phi = -\frac{1}{4}T,$$

and hence

$$x = \sin\left\{\left(1 - \frac{1}{4}k^2\right)t\right\} + O(k^2),$$

for $k \ll 1$. We can see that, as we would expect from the analysis given in Section 9.4,
the leading order amplitude of the oscillation does not change with $t$, in
contrast to the solution of the van der Pol equation that we studied earlier. However,
the period of the oscillation changes by $O(k^2)$ even at this lowest order. If
we take the analysis to $O(k^4)$, we find that the amplitude of the oscillation is also
dependent on $k$ (see King, 1988).
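
The shift in frequency can be checked without any special function library, by integrating (12.57) directly. The following Python sketch is ours. It solves (12.57) with $x = 0$, $dx/dt = 1$ at $t = 0$ using a hand-written fourth order Runge-Kutta scheme, and compares the result with both $\sin t$ and the multiple scales result $\sin\{(1 - \frac{1}{4}k^2)t\}$. The function name, the value of $k$, the step length and the final time are assumptions made for illustration.

```python
import math

def compare_with_sn(k=0.3, t_end=50.0, h=0.005):
    """Integrate x'' + (1 + k^2) x = 2 k^2 x^3, x(0) = 0, x'(0) = 1 by RK4.

    Returns the largest differences between the numerical solution and
    (i) sin((1 - k^2/4) t), the leading order multiple scales solution,
    (ii) sin(t), the solution with the perturbation ignored.
    """
    k2 = k * k

    def rhs(x, v):
        return v, -(1.0 + k2) * x + 2.0 * k2 * x ** 3

    x, v = 0.0, 1.0
    err_scales = err_naive = 0.0
    n_steps = int(round(t_end / h))
    for i in range(n_steps):
        k1x, k1v = rhs(x, v)
        k2x, k2v = rhs(x + 0.5 * h * k1x, v + 0.5 * h * k1v)
        k3x, k3v = rhs(x + 0.5 * h * k2x, v + 0.5 * h * k2v)
        k4x, k4v = rhs(x + h * k3x, v + h * k3v)
        x += h * (k1x + 2 * k2x + 2 * k3x + k4x) / 6.0
        v += h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6.0
        t = (i + 1) * h
        err_scales = max(err_scales, abs(x - math.sin((1.0 - 0.25 * k2) * t)))
        err_naive = max(err_naive, abs(x - math.sin(t)))
    return err_scales, err_naive

if __name__ == "__main__":
    print(compare_with_sn())
```

For $k = 0.3$ and $0 \leq t \leq 50$, the multiple scales solution stays close to the numerical solution, whilst $\sin t$ drifts out of phase by an $O(1)$ amount.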


12.2.5 Slowly Damped Nonlinear Oscillations: Kuzmak’s Method 

The method of multiple scales, as we have described it above, is appropriate for
weakly perturbed linear oscillators. Can we make any progress if the leading order
problem is nonlinear? We will concentrate on the simple example,

$$\frac{d^2 y}{dt^2} + 2\epsilon\frac{dy}{dt} + y - y^3 = 0, \tag{12.60}$$

subject to

$$y(0) = 0, \quad \frac{dy}{dt}(0) = v_0 > 0, \tag{12.61}$$

with $\epsilon$ small and positive. Let’s begin by considering the leading order problem,
with $\epsilon = 0$. As we saw in Chapter 9, since $dy/dt$ does not appear in (12.60) when
$\epsilon = 0$, we can integrate once to obtain

$$\left(\frac{dy}{dt}\right)^2 = E - y^2 + \frac{1}{2}y^4, \tag{12.62}$$

with $E = v_0^2$ from the initial conditions, (12.61). If we now assume that $v_0 < 1/\sqrt{2}$†,
and scale $y$ and $t$ using

$$y = \left(1 - \sqrt{1 - 2E}\right)^{1/2}\hat{y}, \quad t = \left(\frac{1 - \sqrt{1 - 2E}}{E}\right)^{1/2}\hat{t},$$

(12.62) becomes

$$\left(\frac{d\hat{y}}{d\hat{t}}\right)^2 = (1 - \hat{y}^2)(1 - k^2\hat{y}^2), \tag{12.63}$$

where

$$k = \left(\frac{1 - \sqrt{1 - 2E}}{1 + \sqrt{1 - 2E}}\right)^{1/2}. \tag{12.64}$$

If we compare (12.63) with the system that we studied in Section 9.4, we find that

$$y = \left(1 - \sqrt{1 - 2E}\right)^{1/2}\mathrm{sn}\left\{\left(\frac{E}{1 - \sqrt{1 - 2E}}\right)^{1/2}t\,;\,k\right\}. \tag{12.65}$$

In the absence of any damping ($\epsilon = 0$), $y$ varies periodically in a manner described
by the Jacobian elliptic function sn. In addition, Example 2 in the previous section
shows that $y \sim v_0\sin t$ when $v_0 \ll 1$. This is to be expected, since the nonlinear
term in (12.60) is negligible when $v_0$, and hence $y$, is small.

For $\epsilon$ small and positive, but nonzero, by analogy with what we found using the
method of multiple scales in the previous section, we expect that weak damping
leads to a slow decrease in the amplitude of the oscillation, and possibly some change
in its phase. In order to construct an asymptotic solution valid for $t = O(\epsilon^{-1})$, when
the amplitude and phase of the oscillation have changed significantly, we begin in
the usual way by defining a slow time scale, $T = \epsilon t$. However, for a nonlinear
oscillator, the frequency of the leading order solution depends upon the amplitude
of the oscillation, so it is now convenient to define

$$\psi = \frac{\theta(T)}{\epsilon} + \phi(T), \quad \theta(0) = 0,$$

and seek a solution $y \equiv y(\psi, T)$. This was first done by Kuzmak (1959), although
not in full generality. Since

$$\frac{dy}{dt} = (\theta' + \epsilon\phi')\frac{\partial y}{\partial\psi} + \epsilon\frac{\partial y}{\partial T},$$

where a prime denotes $d/dT$, we can see that $\theta'(T) \equiv \omega(T)$ is the frequency of the
oscillation at leading order and $\phi(T)$ the change in phase, both of which we must
determine as part of the solution. In terms of these new variables, (12.60) and
(12.61) become

$$(\omega + \epsilon\phi')^2\frac{\partial^2 y}{\partial\psi^2} + 2\epsilon(\omega + \epsilon\phi')\frac{\partial^2 y}{\partial T\,\partial\psi} + \epsilon^2\frac{\partial^2 y}{\partial T^2} + \epsilon(\omega' + \epsilon\phi'')\frac{\partial y}{\partial\psi} + 2\epsilon(\omega + \epsilon\phi')\frac{\partial y}{\partial\psi} + 2\epsilon^2\frac{\partial y}{\partial T} + y - y^3 = 0, \tag{12.66}$$

subject to

$$y(0, 0) = 0, \quad (\omega(0) + \epsilon\phi'(0))\frac{\partial y}{\partial\psi}(0, 0) + \epsilon\frac{\partial y}{\partial T}(0, 0) = v_0. \tag{12.67}$$

† The usual graphical argument shows that this is a sufficient condition for the solution to be
periodic in $t$.

We now expand

$$y = y_0(\psi, T) + \epsilon y_1(\psi, T) + O(\epsilon^2),$$

and substitute into (12.66) and (12.67). At leading order, we obtain

$$\omega^2(T)\frac{\partial^2 y_0}{\partial\psi^2} + y_0 - y_0^3 = 0, \tag{12.68}$$

subject to

$$y_0(0, 0) = 0, \quad \omega(0)\frac{\partial y_0}{\partial\psi}(0, 0) = v_0. \tag{12.69}$$

As we have seen, this has solution

$$y_0 = \left(1 - \sqrt{1 - 2E(T)}\right)^{1/2}\mathrm{sn}\left\{\left(\frac{E(T)}{1 - \sqrt{1 - 2E(T)}}\right)^{1/2}\frac{\psi}{\omega(T)}\,;\,k(T)\right\}, \tag{12.70}$$

where $k$ is given by (12.64). The initial conditions, (12.69), show that

$$E(0) = v_0^2, \quad \phi(0) = 0. \tag{12.71}$$


(12.71) 


The period of the oscillation, $P(T)$, can be determined by noting that the quarter
period is

$$\frac{1}{4}P(T) = \omega(T)\int_0^{(1 - \sqrt{1 - 2E(T)})^{1/2}}\frac{ds}{\sqrt{E(T) - s^2 + \frac{1}{2}s^4}},$$

which, after a simple rescaling, shows that

$$P(T) = 4\omega(T)\left(\frac{1 - \sqrt{1 - 2E(T)}}{E(T)}\right)^{1/2}K(k(E)), \tag{12.72}$$

where

$$K(k) = \int_0^1\frac{ds}{\sqrt{1 - s^2}\sqrt{1 - k^2 s^2}} \tag{12.73}$$

is the complete elliptic integral of the first kind.

As in the method of multiple scales, we need to go to $O(\epsilon)$ to determine the
behaviour of the solution on the slow time scale, $T$. Firstly, note that we have
three unknowns to determine, $E(T)$, $\phi(T)$ and $\omega(T)$, whilst for the method of
multiple scales, we had just two, equivalent to $E(T)$ and $\phi(T)$. Since we introduced
$\theta(T)$ simply to account for the fact that the period of the oscillation changes with
the amplitude, we have the freedom to choose this new time scale so that the period
of the oscillation is constant. For convenience, we take $P = 1$, so that

$$\frac{d\theta}{dT} = \omega \equiv \omega(E) = \frac{1}{4K(k(E))}\left(\frac{E}{1 - \sqrt{1 - 2E}}\right)^{1/2}. \tag{12.74}$$

We also need to note for future reference the parity with respect to $\psi$ of $y_0$ and
its derivatives. Both $y_0$ and $\partial^2 y_0/\partial\psi^2$ are odd functions, whilst $\partial y_0/\partial\psi$ is an even
function of $\psi$. In addition, we can now treat $y_0$ as a function of $\psi$ and $E$, with

$$y_0(\psi, E) = \left(1 - \sqrt{1 - 2E}\right)^{1/2}\mathrm{sn}\left\{4K(k(E))\psi\,;\,k(E)\right\}. \tag{12.75}$$

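At $O(\epsilon)$ we obtain

$$\omega^2\frac{\partial^2 y_1}{\partial\psi^2} + (1 - 3y_0^2)y_1 = -2\omega\phi'\frac{\partial^2 y_0}{\partial\psi^2} - 2\omega\frac{\partial^2 y_0}{\partial\psi\,\partial E}\frac{dE}{dT} - \omega'\frac{\partial y_0}{\partial\psi} - 2\omega\frac{\partial y_0}{\partial\psi},$$

which we write as

$$L(y_1) = R_{1\mathrm{odd}} + R_{1\mathrm{even}}, \tag{12.76}$$

where

$$L = \omega^2\frac{\partial^2}{\partial\psi^2} + 1 - 3y_0^2, \tag{12.77}$$

and

$$R_{1\mathrm{odd}} = -2\omega\phi'\frac{\partial^2 y_0}{\partial\psi^2}, \quad R_{1\mathrm{even}} = -2\omega\frac{\partial^2 y_0}{\partial\psi\,\partial E}\frac{dE}{dT} - \omega'\frac{\partial y_0}{\partial\psi} - 2\omega\frac{\partial y_0}{\partial\psi}. \tag{12.78}$$

Now, by differentiating (12.68) with respect to $\psi$, we find that

$$L\left(\frac{\partial y_0}{\partial\psi}\right) = 0,$$

so that $\partial y_0/\partial\psi$ is a solution of the homogeneous version of (12.76). For the solution
of (12.76) to be periodic, the right hand side must be orthogonal to the solution of
the homogeneous equation, and therefore orthogonal to $\partial y_0/\partial\psi$.† This is equivalent
to the elimination of secular terms in the method of multiple scales. Since $\partial y_0/\partial\psi$
is even in $\psi$, this means that

$$\int_0^1\frac{\partial y_0}{\partial\psi}R_{1\mathrm{even}}\,d\psi = 0,$$

which is the equivalent of the secularity condition that determines the amplitude
of the oscillation in the method of multiple scales. Using (12.78), we can write this
as‡

$$\frac{\partial}{\partial E}\left\{\omega\int_0^1\left(\frac{\partial y_0}{\partial\psi}\right)^2 d\psi\right\}\frac{dE}{dT} + 2\omega\int_0^1\left(\frac{\partial y_0}{\partial\psi}\right)^2 d\psi = 0. \tag{12.79}$$

† Strictly speaking, this is a result that arises from Floquet theory, the theory of linear ordinary
differential equations with periodic coefficients.

‡ Note that the quantity in the curly brackets in (12.79) is often referred to as the action.

To proceed, we firstly note that

$$\omega\int_0^1\left(\frac{\partial y_0}{\partial\psi}\right)^2 d\psi = 4\int_0^{(1 - \sqrt{1 - 2E})^{1/2}}\sqrt{E - y_0^2 + \frac{1}{2}y_0^4}\;dy_0,$$

and hence, using the quarter period relation with $P = 1$, that

$$\frac{\partial}{\partial E}\left\{\omega\int_0^1\left(\frac{\partial y_0}{\partial\psi}\right)^2 d\psi\right\} = 2\int_0^{(1 - \sqrt{1 - 2E})^{1/2}}\frac{dy_0}{\sqrt{E - y_0^2 + \frac{1}{2}y_0^4}} = \frac{1}{2\omega}. \tag{12.80}$$

Secondly, using the results of Section 9.4, we find that

$$\begin{aligned}
\int_0^1\left(\frac{\partial y_0}{\partial\psi}\right)^2 d\psi &= 16K^2(E)\left(1 - \sqrt{1 - 2E}\right)\int_0^1\left\{1 - \mathrm{sn}^2(4K\psi; k)\right\}\left\{1 - k^2\mathrm{sn}^2(4K\psi; k)\right\}d\psi \\
&= 16K(E)\left(1 - \sqrt{1 - 2E}\right)\int_0^K\left\{1 - \mathrm{sn}^2(\psi; k)\right\}\left\{1 - k^2\mathrm{sn}^2(\psi; k)\right\}d\psi.
\end{aligned}$$

We now need a standard result on elliptic functions§, namely that

$$\int_0^K\left\{1 - \mathrm{sn}^2(\psi; k)\right\}\left\{1 - k^2\mathrm{sn}^2(\psi; k)\right\}d\psi = \frac{1}{3k^2}\left\{(1 + k^2)L(k) - (1 - k^2)K(k)\right\},$$

where

$$L(k) = \int_0^1\sqrt{\frac{1 - k^2 s^2}{1 - s^2}}\;ds$$

is the complete elliptic integral of the second kind¶. Equation (12.79) and
the definition of $\omega$, (12.74), then show that

$$\frac{dE}{dT} = -\frac{4E}{3k^2 K(k)}\left\{(1 + k^2)L(k) - (1 - k^2)K(k)\right\}. \tag{12.81}$$

§ See, for example, Byrd (1971).

¶ Although the usual notation for the complete elliptic integral of the second kind is $E(k)$, the
symbol $E$ is already spoken for in this analysis.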
This equation, along with the initial condition, (12.71), determines $E(T)$. For $v_0 \ll 1$,
this reduces to the multiple scales result, $dE/dT = -2E$ (see Exercise 12.10). It
is straightforward to integrate (12.81) using MATLAB, since the complete elliptic
integrals of the first and second kinds can be calculated using the built-in function
ellipke. Note that we can simultaneously calculate $\theta(T)$ numerically by solving
(12.74) subject to $\theta(0) = 0$. Figure 12.13 shows the function $E(T)$ calculated
for $E(0) = 0.45$, and also the corresponding result using multiple scales on the
linearized problem, $E = E(0)e^{-2T}$.



Fig. 12.13. The solution of (12.81) when $E(0) = 0.45$, and the corresponding multiple
scales solution of the linearized problem, $E(T) = E(0)e^{-2T}$.
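
The calculation behind a figure of this kind can be sketched without MATLAB. The following Python code is ours, and it rests on an assumption that should be stated loudly: we take (12.81) to read $dE/dT = -\frac{4E}{3k^2K}\{(1 + k^2)L - (1 - k^2)K\}$, with $k$ given by (12.64) and $K$, $L$ the complete elliptic integrals of the first and second kinds. The integrals are computed from the arithmetic-geometric mean rather than from ellipke, and the function names, step count and final time are illustrative.

```python
import math

def ellip_K_L(k):
    """Complete elliptic integrals of the first (K) and second (L) kinds, modulus k,
    computed with the arithmetic-geometric mean iteration (0 <= k < 1)."""
    a, b, c = 1.0, math.sqrt(1.0 - k * k), k
    weight = 0.5
    total = weight * c * c              # running sum of 2^(n-1) * c_n^2, from n = 0
    while abs(c) > 1e-15:
        a, b, c = 0.5 * (a + b), math.sqrt(a * b), 0.5 * (a - b)
        weight *= 2.0
        total += weight * c * c
    K = math.pi / (2.0 * a)
    return K, K * (1.0 - total)

def dE_dT(E):
    """Assumed form of the right hand side of (12.81), valid for 0 < E < 1/2."""
    root = math.sqrt(1.0 - 2.0 * E)
    k2 = (1.0 - root) / (1.0 + root)    # k^2 from (12.64)
    K, L = ellip_K_L(math.sqrt(k2))
    return -4.0 * E / (3.0 * k2 * K) * ((1.0 + k2) * L - (1.0 - k2) * K)

def solve_E(E0, T_end=1.0, n_steps=1000):
    """Integrate dE/dT from T = 0 to T_end with classical RK4; returns E(T_end)."""
    h = T_end / n_steps
    E = E0
    for _ in range(n_steps):
        k1 = dE_dT(E)
        k2 = dE_dT(E + 0.5 * h * k1)
        k3 = dE_dT(E + 0.5 * h * k2)
        k4 = dE_dT(E + h * k3)
        E += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    return E

if __name__ == "__main__":
    E0 = 0.45
    print("Kuzmak E(1):     ", solve_E(E0))
    print("linearized E(1): ", E0 * math.exp(-2.0))
```

Under this assumed form, the decay rate tends to 2 as $E \to 0$, in agreement with $dE/dT = -2E$, and is smaller than 2 for larger $E$, so that $E(T)$ lies above $E(0)e^{-2T}$. Note that MATLAB's ellipke takes the parameter $m = k^2$ rather than the modulus $k$ as its argument.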


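We now need to find an equation that determines the phase, $\phi(T)$. Unfortunately,
unlike the method of multiple scales, we need to determine the solution at $O(\epsilon)$ in
order to make progress. By differentiating (12.68) with respect to $E$, we find that

$$L\left(\frac{\partial y_0}{\partial E}\right) = -2\omega\frac{d\omega}{dE}\frac{\partial^2 y_0}{\partial\psi^2}.$$

We also note that

$$L\left(\psi\frac{\partial y_0}{\partial\psi}\right) = 2\omega^2\frac{\partial^2 y_0}{\partial\psi^2}, \tag{12.82}$$

and hence that

$$L(y_{0\mathrm{odd}}) = 0,$$

where

$$y_{0\mathrm{odd}} = \omega\frac{\partial y_0}{\partial E} + \frac{d\omega}{dE}\psi\frac{\partial y_0}{\partial\psi}. \tag{12.83}$$

The general solution of the homogeneous version of (12.76) is therefore a linear
combination of the even function $\partial y_0/\partial\psi$ and the odd function $y_{0\mathrm{odd}}$. To find the
particular integral solution of (12.76), we note from (12.82) that

$$L\left(-\frac{\phi'}{\omega}\psi\frac{\partial y_0}{\partial\psi}\right) = -2\omega\phi'\frac{\partial^2 y_0}{\partial\psi^2} = R_{1\mathrm{odd}}.$$

We therefore have

$$y_1 = A(T)\frac{\partial y_0}{\partial\psi} + B(T)\left(\omega\frac{\partial y_0}{\partial E} + \frac{d\omega}{dE}\psi\frac{\partial y_0}{\partial\psi}\right) - \frac{\phi'}{\omega}\psi\frac{\partial y_0}{\partial\psi} + y_{1\mathrm{even}}, \tag{12.84}$$

where $y_{1\mathrm{even}}$ is the part of the particular integral generated by $R_{1\mathrm{even}}$, and is itself
even in $\psi$. For $y_1$ to be bounded as $\psi \to \infty$, the coefficient of $\psi$ must be zero.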
Noting from (12.75) that 

dyo . dK dy 0 

dE~ i dE i ’W’ 


this means that 


B(T) = 


cP'/u 


(12.85) 


AujdK / dE + duj/ dE 

The easiest way to proceed is to multiply (12.66) by dy/dip , and integrate over 
one period of the oscillation. After taking into account the parity of the components 
of each integrand, we find that 


(I) #}=o. 


At leading order, this reproduces (12.79), whilst at 0(e) we find that 

nl ditr. rb/, b 1 / ditn\ 2 \ 1 

dip ) > = 0. 


dT 


■vi 2 ui ^^+4,' r tdy ° 


1 0 dip dip 


dip 


(12.86) 


Now, using what we know about the parity of the various components of ij \ , we can 
show that 

2 


rl dyo d Vl 
— — # = 


<P' f 1 dy 0 d 2 y 0 


J o dip dip dio/dE J 0 dip dipdE 
and hence from (12.86) that 


dip — 


d 


2dco/dEdEj 0 \dip 


1 /d y°\ rt l 

"oT 


d 

dT 


JIT 


<t>' d j f 1 ( dyo \ 

dw/dEdE] J 0 \ dip ^ ^ 1 


= 0. 
Using (12.80), this becomes

$$\frac{d}{dT}\left(\frac{e^{2T}\phi'}{\omega\,d\omega/dE}\right) = 0. \tag{12.87}$$

This is a second order equation for $\phi(T)$. Although we know that $\phi(0) = 0$, to be
able to solve (12.87) we also need to know $\phi'(0)$.

At 0(e), the initial conditions, (12.67), are 


2d (0,0) = 0, u(E( 0))§|(0,0) = -E' (0)|g(0, E(0)) - ^(0)^(0, E(0)). 


( 12 . 88 ) 


By substituting the solution (12.84) for yi into the second of these, we find that 
co(E(0)) | A(0) (0, E(0)) + B(0)u>{E{0)) (0, E(0)) 


+ - Ji)) O E <°» + ^ 1 :»■ °»} 

= (0)g|(0, 15(0)) - ^(0)^(0, U(0)). 

Using (12.85), all of the terms that do not involve ^(O) are odd in ip, and therefore 
vanish when $\psi = 0$. We conclude that $\phi'(0) = 0$, and hence from (12.87) that

$$\phi(T) = 0.$$

Figure 12.14 shows a comparison between the leading order solution computed
using Kuzmak’s method, the leading order multiple scales solution of the linearized
problem, $y = v_0 e^{-\epsilon t}\sin t$, and the numerical solution of the full problem when
$\epsilon = 0.01$. The numerical and Kuzmak solutions are indistinguishable. Although
the multiple scales solution gives a good estimate of $E(T)$, as shown in Figure 12.13,
it does not give an accurate solution of the full problem.

To see how the method works in general for damped nonlinear oscillators, the
interested reader is referred to Bourland and Haberman (1988).


12.2.6 The Effect of Fine Scale Structure on Reaction Diffusion 
Processes 

Consider the two-point boundary value problem

$$\frac{d}{dx}\left\{D\left(x, \frac{x}{\epsilon}\right)\frac{d\theta}{dx}\right\} + R\left(\theta(x), x, \frac{x}{\epsilon}\right) = 0 \quad \text{for } 0 < x < 1, \tag{12.89}$$

subject to the boundary conditions

$$\frac{d\theta}{dx} = 0 \quad \text{at } x = 0 \text{ and } x = 1. \tag{12.90}$$

We can think of $\theta(x)$ as a dimensionless temperature, so that this boundary value
problem models the steady distribution of heat in a conducting material. When
$\epsilon \ll 1$, this material has a fine scale structure varying on a length scale of $O(\epsilon)$, and
a coarse, background structure varying on a length scale of $O(1)$. The steady state
temperature distribution represents a balance between the diffusion of heat, $(D\theta_x)_x$
(where subscripts denote derivatives), and its production by some chemical reaction,
$R$ (see Section 12.2.3, Example 2 for a derivation of this type of balance law). If
we integrate (12.89) over $0 \leq x \leq 1$ and apply the boundary conditions (12.90), we
find that the chemical reaction term must satisfy the solvability condition

$$\int_0^1 R\left(\theta(x), x, \frac{x}{\epsilon}\right)dx = 0 \tag{12.91}$$

for a solution to exist. Physically, since (12.90) states that no heat can escape from
the ends of the material, (12.91) simply says that the overall rate at which heat
is produced must be zero for a steady state to be possible, with sources of heat in
some regions of the material balanced by heat sinks in other regions.

Fig. 12.14. The numerical solution of (12.60) subject to (12.61) when $v_0 = 0.45$ and
$\epsilon = 0.01$ compared to asymptotic solutions calculated using the leading order Kuzmak
solution and the leading order multiple scales solution of the linearized problem.

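In order to use asymptotic methods to determine the solution at leading order
when $\epsilon \ll 1$, we begin by introducing the fast variable, $\hat{x} = x/\epsilon$. In the usual way
(see Section 12.2.4), (12.89) and (12.90) become

$$(D\theta_{\hat{x}})_{\hat{x}} + \epsilon\left\{(D\theta_{\hat{x}})_x + (D\theta_x)_{\hat{x}}\right\} + \epsilon^2\left\{(D\theta_x)_x + R\right\} = 0, \tag{12.92}$$

subject to

$$\theta_{\hat{x}} + \epsilon\theta_x = 0 \quad \text{at } x = \hat{x} = 0 \text{ and } x = 1,\ \hat{x} = 1/\epsilon, \tag{12.93}$$

with $R \equiv R(\theta(x, \hat{x}), x, \hat{x})$, $D \equiv D(x, \hat{x})$. It is quite awkward to apply a boundary
condition at $\hat{x} = 1/\epsilon \gg 1$, but we shall see that we can determine the equation
satisfied by $\theta$ at leading order without using this, and we will not consider it below.
We now expand $\theta$ as

$$\theta = \theta_0(x, \hat{x}) + \epsilon\theta_1(x, \hat{x}) + \epsilon^2\theta_2(x, \hat{x}) + O(\epsilon^3).$$

At leading order,

$$(D\theta_{0\hat{x}})_{\hat{x}} = 0,$$

subject to

$$\theta_{0\hat{x}} = 0 \quad \text{at } x = \hat{x} = 0.$$

We can integrate this once to obtain $D\theta_{0\hat{x}} = \alpha(x)$, with $\alpha(0) = 0$, and then once
more, which gives

$$\theta_0 = \alpha(x)\int_0^{\hat{x}}\frac{ds}{D(x, s)} + f_0(x).$$

At $O(\epsilon)$, we find that

$$(D\theta_{1\hat{x}})_{\hat{x}} = -(D\theta_{0\hat{x}})_x - (D\theta_{0x})_{\hat{x}},$$

which we can write as

$$(D\theta_{1\hat{x}} + D\theta_{0x})_{\hat{x}} = -\alpha'(x). \tag{12.94}$$

We can integrate (12.94) to obtain

$$D\theta_{1\hat{x}} + D\theta_{0x} = -\alpha'(x)\hat{x} + \beta(x).$$

Substituting for $\theta_0$ and integrating again shows that

$$\theta_1 = f_1(x) - f_0'(x)\hat{x} - \alpha'(x)\hat{x}\int_0^{\hat{x}}\frac{ds}{D(x, s)} - \alpha(x)\frac{\partial}{\partial x}\int_0^{\hat{x}}\frac{\hat{x} - s}{D(x, s)}\,ds + \beta(x)\int_0^{\hat{x}}\frac{ds}{D(x, s)}. \tag{12.95}$$

When $\hat{x}$ is large, there are terms in this expression that are of $O(\hat{x}^2)$. These are
secular, and become comparable with the leading order term in the expansion for
$\theta$ when $\hat{x} = 1/\epsilon$. We must eliminate this secularity by taking

$$\lim_{\epsilon \to 0}\left\{\alpha'(x)\,\epsilon\int_0^{1/\epsilon}\frac{ds}{D(x, s)} + \alpha(x)\,\epsilon^2\frac{\partial}{\partial x}\int_0^{1/\epsilon}\frac{1/\epsilon - s}{D(x, s)}\,ds\right\} = 0. \tag{12.96}$$
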


This is a first order ordinary differential equation for $\alpha(x)$. Since $\alpha(0) = 0$, we
conclude that $\alpha \equiv 0$. Note that each of the terms in (12.96) is of $O(1)$ for $\epsilon \ll 1$.
For example, $\epsilon\int_0^{1/\epsilon} ds/D(x, s)$ is the mean value of $1/D$ over the fine spatial scale. We
now have simply $\theta_0 = f_0(x)$. This means that, at leading order, $\theta$ varies only on
the coarse, $O(1)$ length scale. This does not mean that the fine scale structure has
no effect, as we shall see.

Since we now know that $\theta_0 = O(1)$ when $\hat{x} = 1/\epsilon$, we must also eliminate the
secular terms that are of $O(\hat{x})$ when $\hat{x}$ is large in (12.95). We therefore require that

$$f_0' = \frac{d\theta_0}{dx} = \lim_{\epsilon \to 0}\left\{\beta(x)\,\epsilon\int_0^{1/\epsilon}\frac{ds}{D(x, s)}\right\}. \tag{12.97}$$


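In order to determine $\beta$, and hence an equation satisfied by $\theta_0$, we must consider
one further order in the expansion.

At $O(\epsilon^2)$, (12.92) gives

$$\begin{aligned}
(D\theta_{2\hat{x}} + D\theta_{1x})_{\hat{x}} &= -(D\theta_{1\hat{x}})_x - (D\theta_{0x})_x - R(\theta_0(x), x, \hat{x}) \\
&= -\beta'(x) - R(\theta_0(x), x, \hat{x}),
\end{aligned}$$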


which we can integrate twice to obtain 


02 


/ 2 O) - f[(x)x + 


1 d 2 9 0 

2 lh? X 


d 

dx 



x — s 
D(x, s) 



(12.98) 



ds 

D(x , s) 


~0'(x) / JX 7 \ds- / — / R(9 0 (x),x,u)duds. (12.99) 

Jo D(x,s ) Jq D(x,s) J Q 

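In order to eliminate the secular terms of $O(\hat{x}^2)$, we therefore require that

\[ \lim_{\epsilon \to 0} \left\{ \frac{1}{2}\frac{d^2\theta_0}{dx^2} - \beta(x)\,\epsilon^2\,\frac{\partial}{\partial x}\int_0^{1/\epsilon} \frac{1/\epsilon - s}{D(x,s)}\,ds - \beta'(x)\,\epsilon\int_0^{1/\epsilon} \frac{ds}{D(x,s)} - \epsilon^2 \int_0^{1/\epsilon} \frac{1}{D(x,s)} \int_0^s R(\theta_0(x), x, u)\,du\,ds \right\} = 0. \]

If we now use (12.97) to eliminate $\beta(x)$, we arrive at

\[ \lim_{\epsilon \to 0} \left\{ \frac{d^2\theta_0}{dx^2} - 2\,\frac{\epsilon^2\,\dfrac{\partial}{\partial x}\displaystyle\int_0^{1/\epsilon} \frac{s}{D(x,s)}\,ds}{\epsilon \displaystyle\int_0^{1/\epsilon} \frac{ds}{D(x,s)}}\,\frac{d\theta_0}{dx} + 2\epsilon^2 \int_0^{1/\epsilon} \frac{1}{D(x,s)} \int_0^s R(\theta_0(x), x, u)\,du\,ds \right\} = 0. \]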


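After multiplying through by a suitable integrating factor, we can see that $\theta_0$ satisfies the reaction-diffusion equation

\[ \left\{\bar{D}(x)\theta_{0x}\right\}_x + \bar{R}(\theta_0(x), x) = 0, \tag{12.100} \]

on the coarse scale, where

\[ \bar{D}(x) = \lim_{\epsilon \to 0} \exp\left\{ -2 \int^x \frac{\epsilon^2\,\dfrac{\partial}{\partial X}\displaystyle\int_0^{1/\epsilon} \frac{s}{D(X,s)}\,ds}{\epsilon \displaystyle\int_0^{1/\epsilon} \frac{ds}{D(X,s)}}\,dX \right\}, \tag{12.101} \]

\[ \bar{R}(\theta_0(x), x) = \lim_{\epsilon \to 0}\, 2\bar{D}(x)\,\epsilon^2 \int_0^{1/\epsilon} \frac{1}{D(x,s)} \int_0^s R(\theta_0(x), x, u)\,du\,ds \tag{12.102} \]

are the fine scale averages.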

This asymptotic analysis, which is called homogenization, shows that the lead- 
ing order temperature does not vary on the fine scale, and satisfies a standard 
reaction-diffusion equation on the coarse scale. However, the fine scale structure 
modifies the reaction term and diffusion coefficient through (12.101) and (12.102),
with $\bar{D}$ the homogenized diffusion coefficient and $\bar{R}$ the homogenized reaction
term. If we were to seek higher order corrections, we would find that there
are variations in the temperature on the fine scale, but that these are at most of
$O(\epsilon)$.

One case where $\bar{D}$ and $\bar{R}$ take a particularly simple form is when $D(x, \hat{x}) = D_0(x)\hat{D}(\hat{x})$ and $R(\theta(x), x, \hat{x}) = R_0(\theta(x), x)\hat{R}(\hat{x})$. On substituting these into (12.101)
and (12.102), we find, after cancelling a constant common factor between $\bar{D}$ and
$\bar{R}$, that we can use $\bar{D}(x) = D_0(x)$ and $\bar{R}(\theta, x) = KR_0(\theta, x)$, where

\[ K = \lim_{\epsilon \to 0} \left\{ 2\epsilon^2 \int_0^{1/\epsilon} \frac{1}{\hat{D}(s)} \int_0^s \hat{R}(u)\,du\,ds \right\}. \]

The homogenized diffusion coefficient and reaction term are simply given by the
terms $D_0(x)$ and $R_0(\theta, x)$, modified by a measure of the mean value of the ratio
of the fine scale variation of each, given by the constant $K$. In particular, when
$\hat{D}(\hat{x}) = 1/(1 + A_1 \sin k_1\hat{x})$ and $\hat{R}(\hat{x}) = 1 + A_2 \sin k_2\hat{x}$ for some positive constants
$k_1$, $k_2$, $A_1$ and $A_2$, with $A_1 < 1$ and $A_2 < 1$, we find that $K = 1$. We can illustrate
this for a simple case where it is straightforward to find the exact solution of both
(12.89) and (12.100) analytically. Figure 12.15 shows a comparison between the
exact and asymptotic solutions for various values of $\epsilon$ when $k_1 = k_2 = 1$, $A_1 = A_2 = 1/2$, $D_0(x) = 1/(1 + x)$ and $R_0 = 2x - 1$. The analytical solution of (12.89)
that vanishes at $x = 0$ (the solution would otherwise contain an arbitrary constant
in this case) is


6 = 




^(1 — 2x) 





1 

16 


+e 3 





whilst the corresponding solution of (12.100) correctly reproduces the leading order 
part of this. 
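The claim that $K = 1$ for sinusoidal fine scale variations can be checked numerically. The following is a minimal sketch, not part of the original analysis: the function name `homogenization_constant` and the choice of a trapezium rule are our own assumptions, and the inner integral of $\hat{R}$ has been done by hand. It evaluates $2\epsilon^2 \int_0^{1/\epsilon} \hat{D}^{-1}(s) \int_0^s \hat{R}(u)\,du\,ds$ for decreasing $\epsilon$.

```python
import math


def homogenization_constant(eps, a1=0.5, a2=0.5, k1=1.0, k2=1.0, n=200_000):
    """Approximate 2*eps^2 * int_0^{1/eps} (1/Dhat(s)) * int_0^s Rhat(u) du ds.

    Assumes Dhat(s) = 1/(1 + a1*sin(k1*s)) and Rhat(u) = 1 + a2*sin(k2*u).
    The outer integral uses the composite trapezium rule with n panels.
    """
    upper = 1.0 / eps
    h = upper / n

    def integrand(s):
        inv_dhat = 1.0 + a1 * math.sin(k1 * s)              # 1/Dhat(s)
        inner = s + (a2 / k2) * (1.0 - math.cos(k2 * s))    # int_0^s Rhat(u) du
        return inv_dhat * inner

    total = 0.5 * (integrand(0.0) + integrand(upper))
    for i in range(1, n):
        total += integrand(i * h)
    return 2.0 * eps ** 2 * total * h


if __name__ == "__main__":
    for eps in (0.1, 0.01, 0.001):
        print(eps, homogenization_constant(eps))
```

The printed values should approach 1 as $\epsilon$ decreases, with a discrepancy of $O(\epsilon)$ that comes from the oscillatory terms in the integrand.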

Homogenization has been used successfully in many more challenging applications than this linear, steady state reaction-diffusion equation, for example,
in assessing the strength of elastic media with small carbon fibre reinforcements
(Bakhvalov and Panasenko, 1989).

Fig. 12.15. Analytical and asymptotic solutions of (12.89) when $D = 1/\{(1 + x)(1 + \frac{1}{2}\sin\hat{x})\}$ and $R = (2x - 1)(1 + \frac{1}{2}\sin\hat{x})$.


12.2.7 The WKB Approximation 

In all of the examples that we have seen so far, we have used an expansion in 
algebraic powers of a small parameter to develop perturbation series solutions of dif- 
ferential or algebraic equations. This procedure has been at its most complex when 
we have needed to match a slowly varying outer solution with an exponentially- 
rapidly varying, or dissipative, boundary layer. This procedure doesn’t always 
work! For example, if we consider the two-point boundary value problem 

\[ \epsilon^2 y''(x) + \phi(x)y(x) = 0 \quad \text{subject to } y(0) = 0 \text{ and } y(1) = 1, \tag{12.103} \]

the procedure fails as there are no terms to balance with the leading order term,
$\phi(x)y(x)$. If there were a first derivative term in this problem, the procedure would
work, although we would have a singular perturbation problem. However, a first
derivative term can always be removed. Suppose that we have a differential equation
of the form

\[ w'' + p(x)w' + q(x)w = 0. \]

By writing $w = Wu$, we can easily show that

\[ W'' + \left(\frac{2u'}{u} + p\right)W' + \left(\frac{u''}{u} + p\frac{u'}{u} + q\right)W = 0. \]

By choosing $2u'/u + p = 0$, and hence $u = \exp\left\{-\frac{1}{2}\int^x p(t)\,dt\right\}$, we can remove the first
derivative term. Because of this, there is a sense in which $\epsilon^2 y'' + \phi y = 0$ is a generic
second order ordinary differential equation, and we need to develop a perturbation
method that can deal with it.


The Basic Expansion 

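A suitable method was proposed by Wentzel, Kramers and Brillouin (and perhaps
others as well) in the 1920s. The appropriate asymptotic development is of the
form

\[ y = \exp\left\{\frac{\psi_0(x)}{\epsilon} + \psi_1(x) + O(\epsilon)\right\}. \]

Differentiating this gives

\[ y' = \exp\left\{\frac{\psi_0}{\epsilon} + \psi_1 + O(\epsilon)\right\}\left\{\frac{\psi_0'}{\epsilon} + \psi_1' + O(\epsilon)\right\}, \]

and

\[ y'' = \exp\left\{\frac{\psi_0}{\epsilon} + \psi_1 + O(\epsilon)\right\}\left\{\frac{\psi_0'}{\epsilon} + \psi_1' + O(\epsilon)\right\}^2 + \exp\left\{\frac{\psi_0}{\epsilon} + \psi_1 + O(\epsilon)\right\}\left\{\frac{\psi_0''}{\epsilon} + \psi_1'' + O(\epsilon)\right\} \]

\[ \quad = \exp\left\{\frac{\psi_0}{\epsilon} + \psi_1\right\}\left\{\frac{(\psi_0')^2}{\epsilon^2} + \frac{1}{\epsilon}\left(2\psi_0'\psi_1' + \psi_0''\right) + O(1)\right\}. \]

If we substitute these into (12.103), we obtain

\[ (\psi_0')^2 + \epsilon\left(2\psi_0'\psi_1' + \psi_0''\right) + \phi(x) + O(\epsilon^2) = 0, \]

and hence

\[ (\psi_0')^2 = -\phi(x), \quad \psi_1' = -\frac{\psi_0''}{2\psi_0'} = -\frac{1}{4}\frac{d}{dx}\left(\log\phi(x)\right). \tag{12.104} \]

If $\phi(x) > 0$, say for $x > 0$, we can simply integrate these equations to obtain

\[ \psi_0 = \pm i\int^x \phi^{1/2}(t)\,dt + \text{constant}, \quad \psi_1 = -\frac{1}{4}\log\left(\phi(x)\right) + \text{constant}. \]

All of these constants of integration can be absorbed into a single constant, which
may depend upon $\epsilon$, in front of the exponential. The leading order solution is
rapidly oscillating, or dispersive, in this case, and can be written as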


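\[ y = \frac{A(\epsilon)}{\phi^{1/4}(x)}\exp\left\{\frac{i}{\epsilon}\int^x \phi^{1/2}(t)\,dt\right\} + \frac{B(\epsilon)}{\phi^{1/4}(x)}\exp\left\{-\frac{i}{\epsilon}\int^x \phi^{1/2}(t)\,dt\right\} + \cdots. \tag{12.105} \]

Note that this asymptotic development has assumed that $x = O(1)$, and, in order
that it remains uniform, we must have $\psi_1 \ll \psi_0/\epsilon$.

If $\phi(x) < 0$, say for $x < 0$, then some minor modifications give an exponential,
or dissipative, solution of the form

\[ y = \frac{C(\epsilon)}{|\phi(x)|^{1/4}}\exp\left\{\frac{1}{\epsilon}\int^x |\phi(t)|^{1/2}\,dt\right\} + \frac{D(\epsilon)}{|\phi(x)|^{1/4}}\exp\left\{-\frac{1}{\epsilon}\int^x |\phi(t)|^{1/2}\,dt\right\} + \cdots. \tag{12.106} \]

If $\phi(x)$ is of one sign in the domain of solution, one or other of the expansions
(12.105) and (12.106) will be appropriate. If, however, $\phi(x)$ changes sign, we will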
have an oscillatory solution on one side of the zero and an exponentially growing or 
decaying solution on the other. We will consider how to deal with this combination 
of dispersive and dissipative behaviour after studying a couple of examples. 
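Before turning to the examples, it is instructive to test the oscillatory expansion (12.105) against a direct numerical solution. The sketch below is only an illustration under our own assumptions: we take $\phi(x) = (1 + x)^2$, choose the initial conditions $y(0) = 1$, $y'(0) = -1/2$ so that they agree with the real WKB solution $\phi^{-1/4}\cos(\epsilon^{-1}\int_0^x \phi^{1/2}\,dt)$, and integrate with a hand-written fourth order Runge-Kutta scheme. None of the function names come from the text.

```python
import math


def phi(x):
    # Illustrative choice of coefficient; positive on [0, 1].
    return (1.0 + x) ** 2


def wkb(x, eps):
    """Leading order WKB solution phi^(-1/4) * cos(S/eps), S = int_0^x phi^(1/2) dt."""
    phase = (x + 0.5 * x * x) / eps
    return phi(x) ** -0.25 * math.cos(phase)


def rk4_solution(eps, n=20_000):
    """Integrate eps^2 y'' + phi(x) y = 0 on [0, 1] with y(0) = 1, y'(0) = -1/2."""
    h = 1.0 / n

    def rhs(x, y, v):
        return v, -phi(x) * y / eps ** 2

    x, y, v = 0.0, 1.0, -0.5
    xs, ys = [x], [y]
    for _ in range(n):
        k1y, k1v = rhs(x, y, v)
        k2y, k2v = rhs(x + h / 2, y + h / 2 * k1y, v + h / 2 * k1v)
        k3y, k3v = rhs(x + h / 2, y + h / 2 * k2y, v + h / 2 * k2v)
        k4y, k4v = rhs(x + h, y + h * k3y, v + h * k3v)
        y += h / 6 * (k1y + 2 * k2y + 2 * k3y + k4y)
        v += h / 6 * (k1v + 2 * k2v + 2 * k3v + k4v)
        x += h
        xs.append(x)
        ys.append(y)
    return xs, ys


def max_wkb_error(eps):
    """Largest difference between the numerical and WKB solutions on [0, 1]."""
    xs, ys = rk4_solution(eps)
    return max(abs(y - wkb(x, eps)) for x, y in zip(xs, ys))


if __name__ == "__main__":
    for eps in (0.1, 0.05, 0.025):
        print(eps, max_wkb_error(eps))
```

The maximum error should shrink roughly in proportion to $\epsilon$, which is what we expect from neglecting the $O(\epsilon)$ term in the exponent.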


Example 1: Bessel functions for $x \gg 1$
We saw in Chapter 3 that Bessel’s equation is

\[ y'' + \frac{1}{x}y' + \left(1 - \frac{\nu^2}{x^2}\right)y = 0. \]

If we make the transformation $y = x^{-1/2}Y$, we obtain the generic form of the
equation,

\[ Y'' + \left(1 + \frac{1 - 4\nu^2}{4x^2}\right)Y = 0. \tag{12.107} \]

Although this equation currently contains no small parameter, we can introduce
one in a useful way by defining $\bar{x} = \delta x$. If $x$ is large and positive, we can have
$\bar{x} = O(1)$ by choosing $\delta$ sufficiently small. We have introduced this artificial small
parameter as a device to help us determine how the Bessel function behaves for
$x \gg 1$, and it cannot appear in the final result when we change variables back from
$\bar{x}$ to $x$.

In terms of $\bar{x}$ and $\bar{Y}(\bar{x}) = Y(x)$, (12.107) becomes

\[ \delta^2\bar{Y}'' + \left(1 + \delta^2\,\frac{1 - 4\nu^2}{4\bar{x}^2}\right)\bar{Y} = 0. \tag{12.108} \]

By direct comparison with the derivation of the WKB expansion above, in which
we neglected terms of $O(\delta^2)$,

\[ \psi_0 = \pm i\int^{\bar{x}} dt = \pm i\bar{x}, \quad \psi_1 = -\frac{1}{4}\log 1 = 0, \]

so that

\[ \bar{Y} \sim A(\delta)\exp\left(\frac{i\bar{x}}{\delta}\right) + B(\delta)\exp\left(-\frac{i\bar{x}}{\delta}\right), \]

and hence

\[ y \sim \frac{1}{x^{1/2}}\left(Ae^{ix} + Be^{-ix}\right) \]

is the required expansion. Note that, although we can clearly see that the Bessel
functions are slowly decaying, oscillatory functions of $x$ as $x \to \infty$, we cannot
determine the constants $A$ and $B$ using this technique. As we have already seen, it
is more appropriate to use the integral representation (11.14).
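A short numerical experiment makes the structure of this result concrete. The sketch below is our own illustration, not taken from the text: it evaluates $J_0(x)$ from the standard integral representation $J_0(x) = \pi^{-1}\int_0^\pi \cos(x\sin\theta)\,d\theta$ using a midpoint rule, and compares $x^{1/2}J_0(x)$ with $\sqrt{2/\pi}\cos(x - \pi/4)$. The amplitude and phase in that last expression are the well-known values for $J_0$; as noted above, they cannot be obtained from the WKB expansion itself and are simply assumed here.

```python
import math


def bessel_j0(x, n=4000):
    """J_0(x) = (1/pi) * int_0^pi cos(x sin(theta)) d(theta), by the midpoint rule."""
    h = math.pi / n
    total = sum(math.cos(x * math.sin((i + 0.5) * h)) for i in range(n))
    return total * h / math.pi


def scaled_j0(x):
    """x^(1/2) J_0(x), which the WKB result predicts is a pure sinusoid for large x."""
    return math.sqrt(x) * bessel_j0(x)


def leading_order(x):
    # Known large-x form for J_0; the constants are an input, not a WKB prediction.
    return math.sqrt(2.0 / math.pi) * math.cos(x - math.pi / 4.0)


if __name__ == "__main__":
    for x in (10.0, 50.0, 200.0):
        print(x, scaled_j0(x), leading_order(x))
```

The difference between the two columns decays like $1/x$, consistent with the neglected $O(\delta^2)$ term in (12.108).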


We can show that the WKB method is not restricted to ordinary differential 
equations with terms in just y" and y by considering a further example. 


Example 2: A boundary layer 

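Let’s try to find a uniformly valid approximation to the solution of the two-point
boundary value problem

\[ \epsilon y'' + p(x)y' + q(x)y = 0 \quad \text{subject to } y(0) = \alpha,\ y(1) = \beta \tag{12.109} \]

when $\epsilon \ll 1$ and $p(x) > 0$. If we assume a WKB expansion,

\[ y = \exp\left\{\frac{\psi_0(x)}{\epsilon} + \psi_1(x) + O(\epsilon)\right\}, \]

and substitute into (12.109), we obtain at $O(1/\epsilon)$

\[ \psi_0'\left\{\psi_0' + p(x)\right\} = 0, \tag{12.110} \]

and at $O(1)$,

\[ 2\psi_0'\psi_1' + \psi_0'' + p(x)\psi_1' + q(x) = 0. \tag{12.111} \]

Using the solution $\psi_0' = 0$ of (12.110) and substituting into (12.111) gives $\psi_1' = -q(x)/p(x)$, which generates a solution of the form

\[ y_1 = \exp\left\{\frac{c_1}{\epsilon} - \int_0^x \frac{q(t)}{p(t)}\,dt\right\} = C_1(\epsilon)\exp\left\{-\int_0^x \frac{q(t)}{p(t)}\,dt\right\}. \tag{12.112} \]

The second solution of (12.110) has $\psi_0' = -p(x)$ and hence

\[ \psi_0 = -\int_0^x p(t)\,dt + c_2. \]

Equation (12.111) then gives

\[ \psi_1 = -\log p(x) + \int_0^x \frac{q(t)}{p(t)}\,dt. \]

The second independent solution therefore takes the form

\[ y_2 = \exp\left\{\frac{1}{\epsilon}\left(c_2 - \int_0^x p(t)\,dt\right) - \log p(x) + \int_0^x \frac{q(t)}{p(t)}\,dt\right\} = \frac{C_2}{p(x)}\exp\left\{-\frac{1}{\epsilon}\int_0^x p(t)\,dt + \int_0^x \frac{q(t)}{p(t)}\,dt\right\}. \tag{12.113} \]

Combining (12.112) and (12.113) then gives the general asymptotic solution as

\[ y = C_1\exp\left\{-\int_0^x \frac{q(t)}{p(t)}\,dt\right\} + \frac{C_2}{p(x)}\exp\left\{-\frac{1}{\epsilon}\int_0^x p(t)\,dt + \int_0^x \frac{q(t)}{p(t)}\,dt\right\}. \tag{12.114} \]

We can now apply the boundary conditions in (12.109) to obtain

\[ \alpha = C_1 + \frac{C_2}{p(0)}, \quad \beta = C_1\exp\left\{-\int_0^1 \frac{q(t)}{p(t)}\,dt\right\} + \frac{C_2}{p(1)}\exp\left\{-\frac{1}{\epsilon}\int_0^1 p(t)\,dt + \int_0^1 \frac{q(t)}{p(t)}\,dt\right\}. \]

The term $\exp\left\{-\frac{1}{\epsilon}\int_0^1 p(t)\,dt\right\}$ is uniformly small and can be neglected, so that the
asymptotic solution can be written as

\[ y \sim \beta\exp\left\{\int_x^1 \frac{q(t)}{p(t)}\,dt\right\} + \frac{p(0)}{p(x)}\left[\alpha - \beta\exp\left\{\int_0^1 \frac{q(t)}{p(t)}\,dt\right\}\right]\exp\left\{-\frac{1}{\epsilon}\int_0^x p(t)\,dt + \int_0^x \frac{q(t)}{p(t)}\,dt\right\}. \]

Finally, the last exponential in this solution is negligibly small unless $x = O(\epsilon)$ (the
boundary layer), so we can write

\[ y \sim \beta\exp\left\{\int_x^1 \frac{q(t)}{p(t)}\,dt\right\} + \frac{p(0)}{p(x)}\left[\alpha - \beta\exp\left\{\int_0^1 \frac{q(t)}{p(t)}\,dt\right\}\right]\exp\left\{-\frac{p(0)x}{\epsilon}\right\}. \]

This is precisely the composite expansion that we would have obtained if we had
used the method of matched asymptotic expansions instead.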
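The composite expansion is easy to test in a case where (12.109) can be solved exactly. The following sketch is our own illustration and assumes the constant coefficients $p = q = 1$, for which the exact solution is a combination of two exponentials and the composite expansion reduces to $\beta e^{1-x} + (\alpha - \beta e)e^{-x/\epsilon}$. The function names are hypothetical.

```python
import math


def exact(x, eps, alpha, beta):
    """Exact solution of eps*y'' + y' + y = 0 with y(0) = alpha, y(1) = beta (eps < 1/4)."""
    disc = math.sqrt(1.0 - 4.0 * eps)
    m1 = (-1.0 + disc) / (2.0 * eps)   # slow root, close to -1
    m2 = (-1.0 - disc) / (2.0 * eps)   # fast root, close to -1/eps
    e1, e2 = math.exp(m1), math.exp(m2)
    # Solve c1 + c2 = alpha and c1*e1 + c2*e2 = beta.
    c1 = (beta - alpha * e2) / (e1 - e2)
    c2 = alpha - c1
    return c1 * math.exp(m1 * x) + c2 * math.exp(m2 * x)


def composite(x, eps, alpha, beta):
    """Composite WKB expansion for p = q = 1: outer solution plus boundary layer at x = 0."""
    return beta * math.exp(1.0 - x) + (alpha - beta * math.e) * math.exp(-x / eps)


def max_error(eps, alpha=0.0, beta=1.0, n=1000):
    """Largest difference between exact and composite solutions on a uniform grid."""
    return max(
        abs(exact(i / n, eps, alpha, beta) - composite(i / n, eps, alpha, beta))
        for i in range(n + 1)
    )


if __name__ == "__main__":
    for eps in (0.05, 0.01, 0.001):
        print(eps, max_error(eps))
```

The maximum error is of $O(\epsilon)$, as we would expect from a leading order composite expansion.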

Connection Problems 

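Let’s now consider the boundary value problem

\[ \epsilon^2 y''(x) + \phi(x)y(x) = 0 \quad \text{subject to } y(0) = 1,\ y \to 0 \text{ as } x \to -\infty, \tag{12.115} \]

with

\[ \phi(x) > 0 \ \text{for } x > 0, \qquad \phi(x) \sim \phi_1 x \ \text{for } |x| \ll 1,\ \phi_1 > 0, \qquad \phi(x) < 0 \ \text{for } x < 0. \tag{12.116} \]

To prevent nonuniformities as $|x| \to \infty$, we will also insist that $|\phi(x)| \gg x^{-2}$ for $|x| \gg 1$. The WKB solutions are then

\[ y = \frac{A(\epsilon)}{\phi^{1/4}(x)}\exp\left\{\frac{i}{\epsilon}\int_0^x \phi^{1/2}(t)\,dt\right\} + \frac{B(\epsilon)}{\phi^{1/4}(x)}\exp\left\{-\frac{i}{\epsilon}\int_0^x \phi^{1/2}(t)\,dt\right\} \quad \text{for } x > 0, \tag{12.117} \]

\[ y = C(\epsilon)\exp\left\{\frac{1}{\epsilon}\int_0^x |\phi(t)|^{1/2}\,dt - \frac{1}{4}\log|\phi(x)|\right\} \quad \text{for } x < 0. \tag{12.118} \]

The problem of determining how $A$ and $B$ depend upon $C$ is known as a connection
problem, and can be solved by considering an inner solution in the neighbourhood
of the origin.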

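For $|x| \ll 1$, $\phi \sim \phi_1 x$, so we can estimate the sizes of the terms in the WKB
expansion. For $x < 0$

\[ y \sim C(\epsilon)\exp\left\{\frac{1}{\epsilon}\int_0^x (-\phi_1 t)^{1/2}\,dt - \frac{1}{4}\log(-\phi_1 x)\right\} \sim C(\epsilon)\exp\left\{-\frac{2}{3\epsilon}\phi_1^{1/2}(-x)^{3/2} - \frac{1}{4}\log\phi_1 - \frac{1}{4}\log(-x)\right\}. \]

We can now see that the second term becomes comparable to the first when $-x = O(\epsilon^{2/3})$. A similar estimate of the solution for $x > 0$ also gives a nonuniformity
when $x = O(\epsilon^{2/3})$. The WKB solutions will therefore be valid in two outer regions
with $|x| \gg \epsilon^{2/3}$. We will need a small inner region, centred on the origin, and the
inner solution must match with the outer solutions. Equation (12.115) shows that
the only rescaling possible near to the origin is in a region where $x = O(\epsilon^{2/3})$, and
$\bar{x} = x/\epsilon^{2/3} = O(1)$ for $\epsilon \ll 1$. Writing (12.117) and (12.118) in terms of $\bar{x}$ leads to
the matching conditions

\[ y \sim \frac{C(\epsilon)}{\phi_1^{1/4}(-\bar{x})^{1/4}\epsilon^{1/6}}\exp\left\{-\frac{2}{3}\phi_1^{1/2}(-\bar{x})^{3/2}\right\} \quad \text{as } \bar{x} \to -\infty, \tag{12.119} \]

\[ y \sim \frac{A(\epsilon)}{\phi_1^{1/4}\bar{x}^{1/4}\epsilon^{1/6}}\exp\left\{\frac{2}{3}i\bar{x}^{3/2}\phi_1^{1/2}\right\} + \frac{B(\epsilon)}{\phi_1^{1/4}\bar{x}^{1/4}\epsilon^{1/6}}\exp\left\{-\frac{2}{3}i\bar{x}^{3/2}\phi_1^{1/2}\right\} \quad \text{as } \bar{x} \to \infty. \tag{12.120} \]

If we now rewrite (12.115) in the inner region, making use of $\phi \sim \epsilon^{2/3}\phi_1\bar{x}$ at leading
order, we arrive at

\[ \frac{d^2y}{d\bar{x}^2} + \phi_1\bar{x}y = 0, \tag{12.121} \]

subject to $y(0) = 1$, and the matching conditions (12.119) and (12.120).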

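We can write (12.121)† in terms of a standard equation by defining $t = -\phi_1^{1/3}\bar{x}$, in
terms of which (12.121) becomes

\[ \frac{d^2y}{dt^2} = ty. \]

† Note that equation (12.121) is valid for $|x| \ll 1$, and we would expect its solution to be valid
in the same domain. Hence there is an overlap of the domains for which the inner and outer
solutions are valid, namely $\epsilon^{2/3} \ll |x| \ll 1$, and we can expect the asymptotic matching
process to be successful. In fact the overlap domain can be refined to $\epsilon^{2/3} \ll |x| \ll \epsilon^{2/5}$ (see
Exercise 12.17).

This is Airy’s equation, which we met in Section 3.8, so the solution can be written
as

\[ y = a\,\mathrm{Ai}(t) + b\,\mathrm{Bi}(t) = a\,\mathrm{Ai}\left(-\phi_1^{1/3}\bar{x}\right) + b\,\mathrm{Bi}\left(-\phi_1^{1/3}\bar{x}\right). \tag{12.122} \]

In Section 11.2.3 we determined the asymptotic behaviour of $\mathrm{Ai}(t)$ for $|t| \gg 1$ using
the method of steepest descents. The same technique can be used for $\mathrm{Bi}(t)$, and we
find that

\[ \mathrm{Ai}(t) \sim \frac{1}{2}\pi^{-1/2}t^{-1/4}\exp\left(-\frac{2}{3}t^{3/2}\right), \quad \mathrm{Bi}(t) \sim \pi^{-1/2}t^{-1/4}\exp\left(\frac{2}{3}t^{3/2}\right) \quad \text{as } t \to \infty, \tag{12.123} \]

\[ \mathrm{Ai}(t) \sim \pi^{-1/2}(-t)^{-1/4}\sin\left\{\frac{2}{3}(-t)^{3/2} + \frac{\pi}{4}\right\}, \quad \mathrm{Bi}(t) \sim \pi^{-1/2}(-t)^{-1/4}\cos\left\{\frac{2}{3}(-t)^{3/2} + \frac{\pi}{4}\right\} \quad \text{as } t \to -\infty. \tag{12.124} \]

Using this known behaviour to determine the behaviour of the inner solution,
(12.122), shows that

\[ y \sim \frac{1}{2}a\pi^{-1/2}\phi_1^{-1/12}(-\bar{x})^{-1/4}\exp\left\{-\frac{2}{3}\phi_1^{1/2}(-\bar{x})^{3/2}\right\} + b\pi^{-1/2}\phi_1^{-1/12}(-\bar{x})^{-1/4}\exp\left\{\frac{2}{3}\phi_1^{1/2}(-\bar{x})^{3/2}\right\} \quad \text{as } \bar{x} \to -\infty. \]

In order to satisfy the matching condition (12.119), we must have $b = 0$, so that
only the Airy function $\mathrm{Ai}$ appears in the solution, and

\[ C(\epsilon) = \frac{1}{2}\pi^{-1/2}\epsilon^{1/6}\phi_1^{1/6}a. \tag{12.125} \]

The boundary condition $y(0) = 1$ then gives $a = 1/\mathrm{Ai}(0) = \Gamma\left(\frac{2}{3}\right)3^{2/3}$, and hence
determines $C(\epsilon)$ through (12.125).

As $\bar{x} \to \infty$,

\[ y \sim \frac{a}{\pi^{1/2}\phi_1^{1/12}\bar{x}^{1/4}}\sin\left\{\frac{2}{3}\phi_1^{1/2}\bar{x}^{3/2} + \frac{\pi}{4}\right\} = \frac{a}{2i\pi^{1/2}\phi_1^{1/12}\bar{x}^{1/4}}\left[\exp\left\{i\left(\frac{2}{3}\phi_1^{1/2}\bar{x}^{3/2} + \frac{\pi}{4}\right)\right\} - \exp\left\{-i\left(\frac{2}{3}\phi_1^{1/2}\bar{x}^{3/2} + \frac{\pi}{4}\right)\right\}\right]. \]

We can therefore satisfy the matching condition (12.120) by taking

\[ A(\epsilon) = B^*(\epsilon) = \frac{a\,\epsilon^{1/6}\phi_1^{1/6}}{2i\sqrt{\pi}}\,e^{i\pi/4}. \]

Since $A$ and $B$ are complex conjugate, the solution is real for $x > 0$, as of course
it should be. This determines all of the unknown constants, and completes the
solution. The Airy function $\mathrm{Ai}(t)$ is shown in Figure 11.12, which clearly shows
the transition from dispersive oscillatory behaviour as $t \to -\infty$ to dissipative,
exponential decay as $t \to \infty$.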


12.3 Partial Differential Equations 

Many of the asymptotic methods that we have met can also be applied to partial 
differential equations. As you might expect, the task is usually rather more difficult 
than we have found it to be for ordinary differential equations. We will proceed by 
considering four generic examples. 


Example 1: Asymptotic solutions of the Helmholtz equation 

As an example of an elliptic partial differential equation, let’s consider the solution
of the Helmholtz equation,

\[ \nabla^2\phi + \epsilon^2\phi = 0 \quad \text{for } r \leq 1, \tag{12.126} \]

subject to $\phi(1, \theta) = \sin\theta$ for $\epsilon \ll 1$. This arises naturally as the equation that
governs time-harmonic solutions of the wave equation,

\[ \frac{1}{c^2}\frac{\partial^2 z}{\partial t^2} = \nabla^2 z, \]

which we met in Chapter 3. If we write $z = e^{i\omega t}\phi(\mathbf{x})$, we obtain (12.126), with
$\epsilon = \omega/c$. Since $\phi = O(1)$ on the boundary, we expand $\phi = \phi_0 + \epsilon^2\phi_2 + O(\epsilon^4)$, and
obtain, at leading order,

\[ \nabla^2\phi_0 = 0, \quad \text{subject to } \phi_0(1, \theta) = \sin\theta. \]

If we seek a separable solution of the form $\phi_0 = f(r)\sin\theta$, we obtain

\[ f'' + \frac{1}{r}f' - \frac{1}{r^2}f = 0, \quad \text{subject to } f(1) = 1. \]

This has solutions of the form $f = Ar + Br^{-1}$, so the bounded solution that satisfies
the boundary condition is $f(r) = r$, and hence $\phi_0 = r\sin\theta$. At $O(\epsilon^2)$, (12.126) gives

\[ \nabla^2\phi_2 = -r\sin\theta, \quad \text{subject to } \phi_2(1, \theta) = 0. \]

If we again seek a separable solution, $\phi_2 = F(r)\sin\theta$, we arrive at

\[ F'' + \frac{1}{r}F' - \frac{1}{r^2}F = -r, \quad \text{subject to } F(1) = 0. \]

Using the variation of parameters formula, this has the bounded solution

\[ \phi_2 = \frac{1}{8}\left(r - r^3\right)\sin\theta. \]

The two-term asymptotic expansion of the solution can therefore be written as

\[ \phi = r\sin\theta + \frac{1}{8}\epsilon^2\left(r - r^3\right)\sin\theta + O(\epsilon^4), \]

which is bounded and uniformly valid throughout the circle, $r \leq 1$.

Let’s also consider a boundary value problem for the modified Helmholtz
equation,

\[ \epsilon^2\nabla^2\phi - \phi = 0 \quad \text{subject to } \phi(1, \theta) = 1 \text{ and } \phi \to 0 \text{ as } r \to \infty. \tag{12.127} \]

Note that $\phi = 0$ satisfies both the partial differential equation and the far field
boundary condition, but not the boundary condition at $r = 1$. This suggests that
we need a boundary layer near $r = 1$. If we define $r = 1 + \epsilon\bar{r}$ with $\bar{r} = O(1)$ in the
boundary layer for $\epsilon \ll 1$, we obtain

\[ \phi_{\bar{r}\bar{r}} - \phi = 0 \]

at leading order. This has solution

\[ \phi = A(\theta)e^{\bar{r}} + B(\theta)e^{-\bar{r}}, \]

which will match with the far field solution if $A = 0$, and satisfy the boundary
condition at $\bar{r} = 0$ if $B = 1$. The inner solution, and also a composite solution valid
at leading order for all $r \geq 1$, is therefore

\[ \phi = \exp\left\{-\frac{r - 1}{\epsilon}\right\}. \]


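The two-term expansion of the solution of (12.126) can be compared with the exact bounded separable solution, which is $\phi = \{J_1(\epsilon r)/J_1(\epsilon)\}\sin\theta$. That exact form is not quoted in the text; it is a standard result that we assume here for the purposes of a check. The sketch below, with hypothetical function names, sums the power series for $J_1$ and measures the difference between the two radial profiles.

```python
import math


def bessel_j1(z, terms=30):
    """Power series J_1(z) = sum_k (-1)^k (z/2)^(2k+1) / (k! (k+1)!)."""
    return sum(
        (-1) ** k * (z / 2.0) ** (2 * k + 1) / (math.factorial(k) * math.factorial(k + 1))
        for k in range(terms)
    )


def exact_radial(r, eps):
    """Radial part of the bounded solution of the Helmholtz problem, J_1(eps r)/J_1(eps)."""
    return bessel_j1(eps * r) / bessel_j1(eps)


def two_term_radial(r, eps):
    """Radial part of the two-term expansion, r + eps^2 (r - r^3)/8."""
    return r + eps ** 2 * (r - r ** 3) / 8.0


def max_error(eps, n=200):
    """Largest difference between the two radial profiles on 0 <= r <= 1."""
    return max(abs(exact_radial(i / n, eps) - two_term_radial(i / n, eps)) for i in range(n + 1))


if __name__ == "__main__":
    for eps in (0.4, 0.2, 0.1):
        print(eps, max_error(eps))
```

Halving $\epsilon$ should reduce the error by a factor close to 16, in line with the $O(\epsilon^4)$ remainder.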

Example 2: The small and large time solutions of a diffusion problem 


Consider the initial value problem for the diffusion equation,

\[ \frac{\partial c}{\partial t} = D\frac{\partial^2 c}{\partial x^2} \quad \text{for } -\infty < x < \infty \text{ and } t > 0, \tag{12.128} \]

to be solved subject to the initial condition

\[ c(x, 0) = \begin{cases} f_0(x) & \text{for } x \leq 0, \\ 0 & \text{for } x > 0, \end{cases} \tag{12.129} \]

with $f_0 \in C^2(\mathbb{R})$, $f_0 \to 0$ as $x \to -\infty$, $f_0(0) \neq 0$ and

\[ \int_{-\infty}^0 f_0(x)\,dx = f_{\mathrm{tot}}. \tag{12.130} \]

We could solve this using either Laplace or Fourier transforms. The result, however,
would be in the form of a convolution integral, which does not shed much light on
the structure of the solution. We can gain a lot of insight by asking how the solution
behaves just after the initial state begins to diffuse (the small time solution, $t \ll 1$),
and after a long time ($t \gg 1$).


The small time solution, $t \ll 1$

The general effect of diffusion is to smooth out gradients in the function $c(x, t)$.
It can be helpful to think of $c$ as a distribution of heat or a chemical concentration.
This smoothing is particularly pronounced at points where $c$ is initially discontinuous, in this case at $x = 0$ only. Diffusion will also spread the initial data into $x > 0$,
where $c = 0$ initially. For this reason, we anticipate that there will be three distinct
asymptotic regions.

- Region I: $x < 0$. In this region we expect a gradual smoothing out of the initial
data.

- Region II: $|x| \ll 1$. The major feature in this region will be an instantaneous
smoothing out of the initial discontinuity.

- Region III: $x > 0$. There will be a flux from $x < 0$ into this region, so we expect
an immediate change from $c = 0$ to $c$ nonzero.

Thinking about the physics of the problem before doing any detailed calculation
is usually vital to unlocking the structure of the asymptotic solution of a partial
differential equation.

Let’s begin our analysis in region I by posing an asymptotic expansion valid for
$t \ll 1$,

\[ c(x, t) = f_0(x) + tf_1(x) + t^2f_2(x) + O(t^3). \]

Substituting this into (12.128) gives

\[ f_1 = Df_0'', \quad 2f_2 = Df_1'', \]

and hence

\[ c(x, t) = f_0(x) + tDf_0''(x) + \frac{1}{2}t^2D^2f_0''''(x) + O(t^3). \tag{12.131} \]

Note that $c$ increases in regions where $f_0'' > 0$, and vice versa, as physical intuition
would lead us to expect. Note also that as $x \to 0^-$, $c(x, t) \sim f_0(0)$. However, in
$x > 0$ we would expect $c$ to be small, and the solutions in regions I and III will not
match together without a boundary layer centred on the origin, namely region II.

Before setting up this boundary layer, it is convenient to find the solution in
region III, where we have noted that $c$ is small. If we try a WKB expansion of the
form†

\[ c(x, t) = \exp\left\{-\frac{A(x)}{t} + B(x)\log t + C(x) + o(1)\right\}, \]

and substitute into (12.128), we obtain

\[ A = DA_x^2, \quad A_xB_x = 0, \quad B = -D\left(A_{xx} + 2A_xC_x\right). \]

The solutions of these equations are

\[ A = \frac{x^2}{4D} + \beta x + D\beta^2, \quad B = b, \quad C = -\left(b + \frac{1}{2}\right)\log\left(2\beta D + x\right) + d, \]

where $\beta$, $b$ and $d$ are constants of integration. The WKB solution in region III is
therefore

\[ c = \exp\left\{-\frac{1}{t}\left(\frac{x^2}{4D} + \beta x + D\beta^2\right) + b\log t - \left(b + \frac{1}{2}\right)\log\left(2\beta D + x\right) + d + o(1)\right\}. \tag{12.132} \]

† Note that we are using the small time, $t$, in the WKB expansion that we developed in Section 12.2.7.

In the boundary layer, region II, we define a new variable, $\eta = x/t^\alpha$. In region
II, $|\eta| = O(1)$, and hence $|x| = O(t^\alpha) \ll 1$, with $\alpha > 0$ to be determined. In terms
of $\eta$, (12.128) becomes

\[ \frac{\partial c}{\partial t} - \frac{\alpha\eta}{t}\frac{\partial c}{\partial\eta} = \frac{D}{t^{2\alpha}}\frac{\partial^2 c}{\partial\eta^2}. \]

Since $c = O(1)$ in the boundary layer (remember, $c$ must match with the solution
in region I as $\eta \to -\infty$, where $c = O(1)$), we can balance terms in this equation
to find a distinguished limit when $\alpha = 1/2$. The boundary layer therefore has
thickness of $O(t^{1/2})$, which is a typical diffusive length scale. If we now expand as
$c = c_0(\eta) + o(1)$, we have, at leading order,

\[ -\frac{1}{2}\eta c_{0\eta} = Dc_{0\eta\eta}. \tag{12.133} \]

If we now write the solutions in regions I and III, given by (12.131) and (12.132),
in terms of $\eta$, we arrive at the matching conditions

\[ c_0 \sim f_0(0) \quad \text{as } \eta \to -\infty, \tag{12.134} \]

\[ c_0 \sim \exp\left\{-\frac{D\beta^2}{t} - \frac{\beta\eta}{t^{1/2}} - \frac{\eta^2}{4D} + b\log t - \left(b + \frac{1}{2}\right)\log\left(2\beta D + t^{1/2}\eta\right) + d + o(1)\right\} \quad \text{as } \eta \to \infty. \tag{12.135} \]

The solution of (12.133) is

\[ c_0(\eta) = F + G\int_{-\infty}^{\eta} e^{-s^2/4D}\,ds. \]

As $\eta \to -\infty$, $c \sim F = f_0(0)$. As $\eta \to \infty$, we can use integration by parts to show
that

\[ c_0 = f_0(0) + G\left\{\int_{-\infty}^{\infty} e^{-s^2/4D}\,ds - \frac{2D}{\eta}e^{-\eta^2/4D} + O\left(\frac{1}{\eta^3}e^{-\eta^2/4D}\right)\right\}. \]

In order that this is consistent with the matching condition (12.135), we need

\[ f_0(0) + G\int_{-\infty}^{\infty} e^{-s^2/4D}\,ds = 0, \]

and hence $G = -f_0(0)/\sqrt{4\pi D}$. This leaves us with

\[ c_0 \sim \exp\left\{-\frac{\eta^2}{4D} + \log(-2DG) - \log\eta\right\}. \]

For this to be consistent with (12.135) we need $\beta = 0$, $b = 1/2$ and $d = \log(-2DG)$.

The structure of the solution that we have constructed allows us to be rather
more precise about how diffusion affects the initial data. For $|x| \gg t^{1/2}$, $x < 0$ there
is a slow smoothing of the initial data that involves algebraic powers of $t$, given by
(12.131). For $|x| \gg t^{1/2}$, $x > 0$, $c$ is exponentially small, driven by a diffusive flux
across the boundary layer. For $|x| = O(t^{1/2})$ there is a boundary layer, with the
solution changing by $O(1)$ over a small length of $O(t^{1/2})$. This small time solution
continues to evolve, and, when $t = O(1)$, is not calculable by asymptotic methods.
When $t$ is sufficiently large, a new asymptotic structure emerges, which we shall
consider next.


The large time solution, $t \gg 1$

After a long time, diffusion will have spread out the initial data in a more or
less uniform manner, and the structure of the solution is rather different from that
which we discussed above for $t \ll 1$. We will start our asymptotic development
where $x = O(1)$, and seek a solution of the form

\[ c(x, t) = c_0(t) + c_1(x, t) + \cdots, \]

with $|c_1| \ll |c_0|$ for $t \gg 1$ to ensure that the expansion is asymptotic. If we
substitute this into (12.128), we obtain

\[ \dot{c}_0(t) = Dc_{1xx} \]

at leading order, which can be integrated to give

\[ c_1(x, t) = \frac{\dot{c}_0(t)}{2D}x^2 + \alpha^{\pm}(t)x + \beta^{\pm}(t). \tag{12.136} \]

The distinction between $\alpha^+$ and $\alpha^-$, and similarly for $\beta^{\pm}$, is to account for differences in the solution for $x > 0$ and $x < 0$, introduced by the linear terms. As
$|x| \to \infty$, $c_1$ grows quadratically, which causes a nonuniformity in the expansion,
specifically when $x = O\left(\sqrt{c_0(t)/|\dot{c}_0(t)|}\right)$. In order to deal with this, we introduce a
scaled variable, $\eta = x/\sqrt{c_0(t)/|\dot{c}_0(t)|}$, with $\eta = O(1)$ for $t \gg 1$ in this outer region.
In order to match with the solution in the inner region, where $x = O(1)$, we need

\[ c \sim c_0(t) \quad \text{as } \eta \to 0. \tag{12.137} \]

In terms of $\eta$, (12.128) becomes

\[ \frac{\partial c}{\partial t} - \frac{1}{2}\eta\,\frac{|\dot{c}_0(t)|}{c_0(t)}\,\frac{d}{dt}\left(\frac{c_0(t)}{|\dot{c}_0(t)|}\right)\frac{\partial c}{\partial\eta} = D\,\frac{|\dot{c}_0(t)|}{c_0(t)}\,\frac{\partial^2 c}{\partial\eta^2}. \tag{12.138} \]

Motivated by the matching condition (12.137), we will try to solve this using the
expansion $c = c_0(t)F_{\pm}(\eta) + o(c_0(t))$, subject to $F_{\pm} \to 1$ as $\eta \to 0^{\pm}$ and $F_{\pm} \to 0$
as $\eta \to \pm\infty$. The subscript $\pm$ indicates whether the solution is for $\eta > 0$ or
$\eta < 0$. It is straightforward to substitute this into (12.138), but this leads to some
options. The first and third terms are of $O(\dot{c}_0(t))$, whilst the second term is of
$O\left(|\dot{c}_0(t)|\,\frac{d}{dt}\left(c_0(t)/|\dot{c}_0(t)|\right)\right)$. Should we include the second term in the leading order balance
or not? Let’s see what happens if we do decide to balance these terms to get the
richest limit. We must then have $c_0(t)/\dot{c}_0(t) = O(t)$, and hence $c_0 = Ct^{-a}$ for some
constants $C$ and $a$. This looks like a sensible gauge function.

If we proceed, (12.138) becomes, at leading order, 


F± - -r,F' ± = DF 


(12.139)

This is slightly more difficult to solve than (12.133). However, if we look for a
quadratic solution, we quickly find that $F_{\pm} = \eta^2 + 2D$ is a solution. Using the
method of reduction of order, the general solution of (12.139) is

f !" n e~ s2 / 2D 1 

± l ds ) . (12.140) 

As ?7 — > 0, by Taylor expanding the integrand, we can show that 

F± = (v 2 + 2D) (A* + B ± ^)+ 0(7 f). 

To match with the inner solution, we therefore require that A ± = 1/2 D. As 
ry — > ±oo, using integration by parts, we find that 


F± = (rj 2 + 2D) jA± ± B± j 
so that we require 

A ± ±B ± 


oo e ~s 2 /2D 

o (s 2 + 2D) 2 


* +0 (?*"’'“)} 


f°° g-s /2 D 

J 0 (s 2 + 2D) 2 

The outer solution can therefore be written as 

„2 


F + = 1 


V 

2D 


1± 


fV e -s 2 /2D 


ds 


ds = 0. 


oo g— s 2 /2D 


0 (s 2 ± 2D) 2 / Jo (s 2 + 2D) 2 


ds > . (12.141) 


In order to determine $a$, and hence the size of $c_0(t)$, some further work is needed.
Firstly, we can integrate (12.128) and apply the initial condition, to obtain

\[ \int_{-\infty}^{\infty} c(x, t)\,dx = \int_{-\infty}^{0} f_0(x)\,dx = f_{\mathrm{tot}}. \tag{12.142} \]

This just says that mass is conserved during the diffusion process. Secondly, we
can write down the composite expansion

\[ c = c_{\mathrm{inner}} + c_{\mathrm{outer}} - (c_{\mathrm{inner}})_{\mathrm{outer}} = c_0(t)F_{\pm}(\eta), \]

and use this in (12.142) to obtain

\[ \int_{-\infty}^{\infty} c_0(t)F_{\pm}(\eta)\,dx = c_0(t)\sqrt{\frac{c_0(t)}{|\dot{c}_0(t)|}}\int_{-\infty}^{\infty} F_{\pm}(\eta)\,d\eta = f_{\mathrm{tot}}. \]

This is now a differential equation for $c_0$ in the form

\[ \frac{c_0^{3/2}(t)}{|\dot{c}_0(t)|^{1/2}} = \frac{f_{\mathrm{tot}}}{\int_{-\infty}^{\infty} F_{\pm}(\eta)\,d\eta}. \]

In Exercise 12.18 we find that $\int_{-\infty}^{\infty} F_{\pm}(\eta)\,d\eta = \sqrt{2\pi D}$ and hence that $c_0(t) = f_{\mathrm{tot}}/\sqrt{4\pi Dt}$. We can now calculate that $\sqrt{c_0(t)/|\dot{c}_0(t)|} = O(t^{1/2})$, which is the
usual diffusive length scale. Our asymptotic solution therefore takes the form

\[ c(x, t) \sim \begin{cases} \dfrac{f_{\mathrm{tot}}}{\sqrt{4\pi Dt}} & \text{for } |x| \ll t^{1/2}, \\[2ex] \dfrac{f_{\mathrm{tot}}}{\sqrt{4\pi Dt}}F_{\pm}(\eta) & \text{for } |x| = O(t^{1/2}). \end{cases} \tag{12.143} \]

The success of this approach justifies our decision to choose $c_0(t)$ in order to obtain
the richest distinguished limit. Notice that the large time solution has “forgotten”
the precise details of the initial conditions. It only “remembers” the area under the
initial data, at leading order.

If we consider the particular case $f_0(x) = e^x$, we find (see Exercise 12.18) that
an exact solution is available, namely $c(x, t) = \frac{1}{2}e^{x + Dt}\,\mathrm{erfc}\left(\frac{x}{\sqrt{4Dt}} + \sqrt{Dt}\right)$. This
solution is plotted in Figure 12.16 at various times, and we can clearly see the
structures that our asymptotic solutions predict emerging for both small and large
times. In Figure 12.17 we plot $c(0, t)$ as a function of $Dt$. Our asymptotic solution
predicts that $c(0, t) = \frac{1}{2} + o(1)$ for $t \ll 1$, consistent with Figure 12.17(a). In
Figure 12.17(b) we can see that the asymptotic solution, $c(0, t) \sim 1/\sqrt{4\pi Dt}$ as
$t \to \infty$, is in excellent agreement with the exact solution.
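These predictions are straightforward to check against the exact solution for $f_0(x) = e^x$, for which $f_{\mathrm{tot}} = 1$. The sketch below is our own illustration, with hypothetical function names; it uses `math.erfc` from the Python standard library and works with the combination $Dt$ as a single variable.

```python
import math


def exact_c(x, Dt):
    """Exact solution for f_0(x) = exp(x): (1/2) exp(x + Dt) erfc(x/sqrt(4 Dt) + sqrt(Dt))."""
    return 0.5 * math.exp(x + Dt) * math.erfc(x / math.sqrt(4.0 * Dt) + math.sqrt(Dt))


def small_time_region_one(x, Dt):
    """Three-term small time expansion (12.131) with f_0 = exp(x), valid for x < 0."""
    return math.exp(x) * (1.0 + Dt + 0.5 * Dt ** 2)


def large_time(Dt, f_tot=1.0):
    """Leading order large time solution near x = 0, f_tot / sqrt(4 pi D t)."""
    return f_tot / math.sqrt(4.0 * math.pi * Dt)


if __name__ == "__main__":
    print("c(0, t) for small Dt:", exact_c(0.0, 1e-4))              # close to 1/2
    print("region I check:", exact_c(-1.0, 0.01), small_time_region_one(-1.0, 0.01))
    for Dt in (1.0, 10.0, 100.0):
        print(Dt, exact_c(0.0, Dt), large_time(Dt))
```

For small $Dt$ the value at the origin is close to $\frac{1}{2}f_0(0)$, as the boundary layer solution predicts, whilst for large $Dt$ the ratio of the exact and asymptotic values tends to one.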


Example 3: The wave equation with weak damping 
(i) Linear damping 


Consider the equation

\[ \frac{\partial^2 y}{\partial t^2} = c^2\frac{\partial^2 y}{\partial x^2} - \epsilon\frac{\partial y}{\partial t}, \quad \text{for } t > 0 \text{ and } -\infty < x < \infty, \tag{12.144} \]

subject to the initial conditions

\[ y(x, 0) = Y_0(x), \quad \frac{\partial y}{\partial t}(x, 0) = 0, \tag{12.145} \]

with $\epsilon \ll 1$. The one-dimensional wave equation, (12.144) with $\epsilon = 0$, governs the
small amplitude motion of an elastic string, which we met in Section 3.9.1. The
additional term, $\epsilon y_t$, represents a weak, linear damping, proportional to the velocity
of the string, for example due to drag on the string as it moves through the air.

The form of the initial conditions suggests that we should consider an asymptotic
expansion $y = y_0 + \epsilon y_1 + O(\epsilon^2)$. On substituting this into (12.144) and (12.145) we
obtain

\[ \frac{\partial^2 y_0}{\partial t^2} - c^2\frac{\partial^2 y_0}{\partial x^2} = 0, \quad \text{subject to } y_0(x, 0) = Y_0(x),\ y_{0t}(x, 0) = 0, \tag{12.146} \]

\[ \frac{\partial^2 y_1}{\partial t^2} - c^2\frac{\partial^2 y_1}{\partial x^2} = -\frac{\partial y_0}{\partial t}, \quad \text{subject to } y_1(x, 0) = y_{1t}(x, 0) = 0. \tag{12.147} \]

The initial value problem given by (12.146) is one that we studied in Section 3.9.1,
and has d’Alembert’s solution, (3.43),

\[ y_0(x, t) = \frac{1}{2}\left\{Y_0(x - ct) + Y_0(x + ct)\right\}. \tag{12.148} \]

Fig. 12.16. The solution of the diffusion equation with $f_0(x) = e^x$ at various times, $Dt = 0$, 0.01, 0.05, 0.1, 0.5.

This solution represents the sum of two waves, one travelling to the left and one
travelling to the right, each with speed $c$, without change of form, and with half
the initial amplitude. This is illustrated in Figure 12.18 for the initial condition
$Y_0(x) = 1/(1 + x^2)$. The splitting of the initial profile into left- and right-travelling
waves is clearly visible.

In terms of the characteristic variables, $\xi = x - ct$, $\eta = x + ct$, (12.147) becomes

\[ -4c^2\frac{\partial^2 y_1}{\partial\xi\,\partial\eta} = c\left(\frac{\partial y_0}{\partial\xi} - \frac{\partial y_0}{\partial\eta}\right). \]

Integrating this expression twice gives the solution

\[ y_1 = \frac{1}{8c}\left\{\xi Y_0(\eta) - \eta Y_0(\xi)\right\} + F_1(\xi) + G_1(\eta). \tag{12.149} \]

The initial conditions show that

\[ F_1(x) + G_1(x) = 0 \]

and

\[ F_1'(x) - G_1'(x) = \frac{1}{4c}\left\{xY_0'(x) - Y_0(x)\right\}, \]

which can be integrated once to give

\[ F_1(x) - G_1(x) = \frac{1}{4c}\left\{xY_0(x) - 2\int_0^x Y_0(s)\,ds\right\} + b. \]

Finally,

\[ F_1(x) = -G_1(x) = \frac{1}{8c}\left\{xY_0(x) - 2\int_0^x Y_0(s)\,ds\right\} + \frac{1}{2}b, \]

which, in conjunction with (12.149), shows that

\[ y_1 = -\frac{1}{4}t\left\{Y_0(x + ct) + Y_0(x - ct)\right\} + \frac{1}{4c}\int_{x - ct}^{x + ct} Y_0(s)\,ds. \tag{12.150} \]

We can now see that $y_1 = O(t)$ for $t \gg 1$, and therefore that our asymptotic
expansion becomes nonuniform when $t = O(\epsilon^{-1})$.

Fig. 12.17. The solution of the diffusion equation with $f_0(x) = e^x$ at $x = 0$.

We will proceed using the method of multiple scales, defining a slow time scale
$T = \epsilon t$, and looking for a solution $y = y(x, t, T)$. In terms of these new independent
variables, (12.144) becomes

\[ y_{tt} + 2\epsilon y_{tT} + \epsilon^2 y_{TT} = c^2 y_{xx} - \epsilon y_t - \epsilon^2 y_T. \tag{12.151} \]

If we now seek an asymptotic solution of the form $y = y_0(x, t, T) + \epsilon y_1(x, t, T) + O(\epsilon^2)$, at leading order we obtain (12.146), as before, but now the solution is

\[ y_0(x, t, T) = F_0(\xi, T) + G_0(\eta, T), \tag{12.152} \]

with

\[ F_0(\xi, 0) = \frac{1}{2}Y_0(\xi), \quad G_0(\eta, 0) = \frac{1}{2}Y_0(\eta). \tag{12.153} \]

As usual in the method of multiple scales, we need to go to $O(\epsilon)$ to determine $F_0$
and $G_0$. We find that

\[ -4cy_{1\xi\eta} = F_{0\xi} - G_{0\eta} + 2\left(F_{0\xi T} - G_{0\eta T}\right). \tag{12.154} \]

On solving this equation, the presence of the terms on the right hand side causes $y_1$
to grow linearly with $t$. In order to eliminate them, we must have

\[ F_{0\xi T} = -\frac{1}{2}F_{0\xi}, \quad G_{0\eta T} = -\frac{1}{2}G_{0\eta}. \tag{12.155} \]

If we solve these equations subject to the initial conditions (12.153), we obtain

\[ F_0 = \frac{1}{2}e^{-T/2}Y_0(\xi), \quad G_0 = \frac{1}{2}e^{-T/2}Y_0(\eta), \]

and hence

\[ y_0 = \frac{1}{2}e^{-\epsilon t/2}\left\{Y_0(x - ct) + Y_0(x + ct)\right\}. \tag{12.156} \]

This shows that the small term, $\epsilon y_t$, in (12.144) leads to an exponential decay of
the amplitude of the solution over the slow time scale, $t = O(\epsilon^{-1})$, consistent with
our interpretation of this as a damping term. Figure 12.18 shows how this slow
exponential decay affects the solution.

Fig. 12.18. The solution of (12.144) when $c = 1$ and $Y_0(x) = 1/(1 + x^2)$ at equal time
intervals, $t = 0$, 4, 8, 12, 16, 20, when $\epsilon = 0$ and 0.1.
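The difference between the regular and multiple scales expansions can be seen explicitly for a single Fourier mode. The sketch below is our own illustration and assumes the initial profile $Y_0(x) = \cos kx$, which is not one of the profiles used in the text but allows (12.144) to be solved exactly: the amplitude of $\cos kx$ satisfies a damped oscillator equation. With this choice (12.148) and (12.150) give the regular expansion $\cos ckt + \epsilon\{-\frac{1}{2}t\cos ckt + \sin ckt/(2ck)\}$, whilst (12.156) gives $e^{-\epsilon t/2}\cos ckt$.

```python
import math


def exact_mode(t, eps, c=1.0, k=1.0):
    """Exact amplitude of cos(kx): solves a'' + eps a' + c^2 k^2 a = 0, a(0) = 1, a'(0) = 0."""
    omega = math.sqrt(c * c * k * k - 0.25 * eps * eps)
    decay = math.exp(-0.5 * eps * t)
    return decay * (math.cos(omega * t) + 0.5 * eps / omega * math.sin(omega * t))


def multiple_scales_mode(t, eps, c=1.0, k=1.0):
    """Leading order multiple scales amplitude from (12.156)."""
    return math.exp(-0.5 * eps * t) * math.cos(c * k * t)


def regular_mode(t, eps, c=1.0, k=1.0):
    """Two-term regular expansion y_0 + eps*y_1 from (12.148) and (12.150)."""
    wt = c * k * t
    return math.cos(wt) + eps * (-0.5 * t * math.cos(wt) + math.sin(wt) / (2.0 * c * k))


def max_errors(eps, t_end, n=20_000):
    """Largest errors of the two approximations on 0 <= t <= t_end."""
    ts = [t_end * i / n for i in range(n + 1)]
    err_ms = max(abs(exact_mode(t, eps) - multiple_scales_mode(t, eps)) for t in ts)
    err_reg = max(abs(exact_mode(t, eps) - regular_mode(t, eps)) for t in ts)
    return err_ms, err_reg


if __name__ == "__main__":
    eps = 0.01
    print(max_errors(eps, 2.0 / eps))
```

Over times of $O(\epsilon^{-1})$ the multiple scales result stays within $O(\epsilon)$ of the exact amplitude, whereas the regular expansion develops an $O(1)$ error.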


(ii) Nonlinear damping 

What happens if we replace the linear damping term $\epsilon y_t$ with a nonlinear damping
term, $\epsilon (y_t)^3$? We must then solve

$$\frac{\partial^2 y}{\partial t^2} = c^2\frac{\partial^2 y}{\partial x^2} - \epsilon\left(\frac{\partial y}{\partial t}\right)^3, \quad \text{for } t > 0 \text{ and } -\infty < x < \infty, \tag{12.157}$$

subject to the initial conditions

$$y(x, 0) = Y_0(x), \quad \frac{\partial y}{\partial t}(x, 0) = 0. \tag{12.158}$$

We would again expect a nonuniformity when $t = O(\epsilon^{-1})$, so let's go straight to a
multiple scales expansion, $y = y_0(x, t, T) + \epsilon y_1(x, t, T)$. At leading order, as before,
we have (12.152) and (12.153). At $O(\epsilon)$,

$$-4c\,y_{1\xi\eta} = c^2\left(F_{0\xi} - G_{0\eta}\right)^3 + 2\left(F_{0\xi T} - G_{0\eta T}\right). \tag{12.159}$$

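In order to see clearly which terms are secular, we integrate the expression
$(F_{0\xi} - G_{0\eta})^3$ twice to obtain

$$\eta\int_0^{\xi} F_{0\xi}^3(s)\,ds - 3G_0(\eta)\int_0^{\xi} F_{0\xi}^2(s)\,ds + 3F_0(\xi)\int_0^{\eta} G_{0\eta}^2(s)\,ds - \xi\int_0^{\eta} G_{0\eta}^3(s)\,ds.$$

Assuming that $F_0(s)$ and $G_0(s)$ are integrable as $s \to \pm\infty$, we can see that the
terms that become unbounded as $\xi$ and $\eta$ become large are those associated with
$F_{0\xi}^3$ and $G_{0\eta}^3$. We conclude that, to eliminate secular terms in (12.159), we need

$$F_{0\xi T} = -\tfrac{1}{2}c^2 F_{0\xi}^3, \quad G_{0\eta T} = -\tfrac{1}{2}c^2 G_{0\eta}^3,$$

to be solved subject to (12.153). The solutions are

$$F_{0\xi} = \frac{Y_0'(\xi)}{\sqrt{4 + c^2 T\{Y_0'(\xi)\}^2}}, \quad G_{0\eta} = \frac{Y_0'(\eta)}{\sqrt{4 + c^2 T\{Y_0'(\eta)\}^2}},$$
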
and hence

$$y_0 = -\int_{\xi}^{\infty} \frac{Y_0'(s)}{\sqrt{4 + c^2 T\{Y_0'(s)\}^2}}\,ds - \int_{\eta}^{\infty} \frac{Y_0'(s)}{\sqrt{4 + c^2 T\{Y_0'(s)\}^2}}\,ds. \tag{12.160}$$

In general, these integrals cannot be determined analytically, but we can see that
the amplitudes of the waves do decay as $T$ increases, and are of $O(T^{-1/2})$ for $T \gg 1$.
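
The algebraic decay can be checked numerically. The sketch below, in Python, integrates the secularity condition $F_{0\xi T} = -\tfrac{1}{2}c^2F_{0\xi}^3$ from $F_{0\xi} = \tfrac{1}{2}Y_0'$ at $T = 0$ with a classical Runge-Kutta scheme and compares the result with the closed form $Y_0'/\sqrt{4 + c^2T\{Y_0'\}^2}$. The function names, the sample slope `yp0` and the step count are illustrative assumptions, not part of the text.

```python
import math

def slope_exact(T, yp0, c=1.0):
    # Closed form for F_0xi at slow time T, where yp0 stands for Y_0'(xi).
    return yp0 / math.sqrt(4.0 + c * c * T * yp0 * yp0)

def slope_rk4(T_end, yp0, c=1.0, n=2000):
    # Integrate dp/dT = -(c^2 / 2) p^3 from p(0) = yp0 / 2 with classical RK4.
    def f(p):
        return -0.5 * c * c * p**3
    p = 0.5 * yp0
    dT = T_end / n
    for _ in range(n):
        k1 = f(p)
        k2 = f(p + 0.5 * dT * k1)
        k3 = f(p + 0.5 * dT * k2)
        k4 = f(p + dT * k3)
        p += dT * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    return p

print(slope_rk4(5.0, 0.7), slope_exact(5.0, 0.7))
# For large T the slope tends to 1 / (c sqrt(T)), whatever the initial slope.
print(slope_exact(1.0e6, 0.7) * math.sqrt(1.0e6))
```

The second line of output illustrates that, for $T \gg 1$, the wave slope forgets its initial size, in contrast to the exponential decay found with linear damping.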

Example: The measurement of oil fractions using local electrical probes

As we remarked at the beginning of the book, many problems that arise in engi- 
neering are susceptible to mathematical modelling. We can break the modelling 
process down into separate steps. 

(i) Identify the important physical processes that are involved. 

(ii) Write down the governing equations and boundary conditions. 

(iii) Define dimensionless variables and identify dimensionless constants. 

(iv) Solve the governing equations using either a numerical method or an asymp- 
totic method. 

Note that, although it is possible that we can find an analytical solution, this is 
highly unlikely when studying real world problems. As we discussed at the start 
of Chapter 11, when one or more of the dimensionless parameters is small, we can 
use an asymptotic solution technique. Let’s now discuss an example of this type of 
situation. 

For obvious reasons, oil companies are interested in how much oil is coming out 
of their oilwells, and often want to make this measurement at the point where oil 
is entering the well as droplets, rather than at the surface. One tool that can be 
lowered into a producing oilwell to assist with this task is a local probe. This is 
a device with a tip that senses whether it is in oil or water. The output from the 
probe can be time-averaged to give the local oil fraction at the tip, and an array 
of probes deployed to give a measurement of how the oil fraction varies across the 
well. We will consider a simple device that distinguishes between oil and water by 
measuring electrical conductivity, which is several orders of magnitude higher in 
saline water than in oil. 

The geometry of the electrical probe, which is made from sharpening the tip 
of a coaxial cable like a pencil, is shown in Figure 12.19. A voltage is applied to 
the core of the probe, whilst the outer layer, or cladding, is earthed. A measure- 
ment of the current between the core and the cladding is then made to determine 
the conductivity of the surrounding medium. Although this measurement gives a 
straightforward way of distinguishing between oil and water when only one liquid 
is present, for example when dipping the probe into a beaker containing a single 
liquid, the difficulty lies in interpreting the change in conductivity as a droplet of 
oil approaches, meets, deforms around and is penetrated by the probe. If we want 
to understand and model this process, there is clearly a difficult fluid mechanical 
problem to be solved before we can begin to relate the configuration of the oil 
droplet to the current through the probe (see Billingham and King, 1995). We will 
pre-empt all of this fluid mechanical complexity by considering what happens if, in 
the course of the interaction of an oil droplet with a probe, a thin layer of oil forms 
on the surface of the probe. How thin must this oil layer become before the current 
through the probe is effectively equal to that of a probe in pure water?

Fig. 12.19. A cross-section through an axisymmetric electrical probe.


In order to answer this question, we must solve a problem in electrostatics,
since the speed at which oil-water interfaces move is much less than the speed at
which electromagnetic disturbances travel (the speed of light)†. The electrostatic
potential, $\phi$, is an axisymmetric solution of Laplace's equation,

$$\nabla^2\phi = 0. \tag{12.161}$$

We will assume that the conducting parts of the probe are perfect conductors,
so that

$$\phi = \begin{cases} 1 & \text{at the surface of the core,} \\ 0 & \text{at the earthed surface of the cladding.} \end{cases} \tag{12.162}$$

At interfaces between different media, for example oil and water or oil and insulator,
we have the jump conditions

$$[\phi] = 0, \quad \left[\sigma\frac{\partial\phi}{\partial n}\right] = 0. \tag{12.163}$$

Square brackets indicate the change in the enclosed quantity across an interface, $\sigma$ is
the conductivity, which is different in each medium (oil, water and insulator), and
$\partial/\partial n$ is the derivative in the direction normal to the interface. Equation (12.163)
represents continuity of potential and continuity of current at an interface. To
complete the problem, we have the far field conditions that

$$\phi \to 0 \text{ as } r^2 + z^2 \to \infty \text{ outside the probe,} \tag{12.164}$$

and

$$\phi \sim \phi_\infty(r) \text{ as } z \to \infty \text{ for } r_0 < r < r_1, \tag{12.165}$$

using cylindrical polar coordinates coaxial with the probe, and $r = 0$ at the tip. Here
$r_0$ and $r_1$ are the inner and outer radii of the insulator, as shown in Figure 12.19.

† This, in itself, is an asymptotic approximation that can be made rigorous by defining a small
parameter, the ratio of a typical fluid speed to the speed of light. Some approximations are,
however, so obvious that justifying them rigorously is a little too pedantic. For a simple
introduction to electromagnetism, see Billingham and King (2001).

The far field potential must satisfy

$$\nabla^2\phi_\infty(r) = \frac{d^2\phi_\infty}{dr^2} + \frac{1}{r}\frac{d\phi_\infty}{dr} = 0, \quad \text{for } r_0 < r < r_1,$$

subject to

$$\phi_\infty = 1 \text{ at } r = r_0, \quad \phi_\infty = 0 \text{ at } r = r_1.$$

This has solution

$$\phi_\infty = \frac{\log(r_1/r)}{\log(r_1/r_0)}. \tag{12.166}$$
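
A short Python sketch can confirm that (12.166) does what is claimed: it takes the values 1 and 0 at the two radii and has zero radial Laplacian in between. The radii used here are illustrative only, and the derivative estimates are simple central differences.

```python
import math

def phi_inf(r, r0, r1):
    # Far field potential (12.166) in the insulator, r0 < r < r1.
    return math.log(r1 / r) / math.log(r1 / r0)

def radial_laplacian(r, r0, r1, h=1e-4):
    # Central-difference estimate of phi'' + phi' / r.
    d2 = (phi_inf(r + h, r0, r1) - 2 * phi_inf(r, r0, r1) + phi_inf(r - h, r0, r1)) / h**2
    d1 = (phi_inf(r + h, r0, r1) - phi_inf(r - h, r0, r1)) / (2 * h)
    return d2 + d1 / r

r0, r1 = 1.0, 3.0  # illustrative inner and outer radii, not values from the text
print(phi_inf(r0, r0, r1), phi_inf(r1, r0, r1), radial_laplacian(2.0, r0, r1))
```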


Finally, we will assume that the probe is surrounded by water, except for a uniform
layer of oil on its surface of thickness $h \ll r_0$. Our task is to solve the boundary
value problem given by (12.161) to (12.165).

This is an example of a problem where the governing equation and boundary 
conditions are fairly straightforward, but the geometry is complicated. Problems 
like this are usually best solved numerically. However, in this case we have one 
region where the aspect ratio is small - the thin oil film. The potential is likely 
to change rapidly across this film compared with its variation elsewhere, and a 
numerical method will have difficulty handling this. We can, however, make some 
progress by looking for an asymptotic solution in the thin film. The first thing 
to do is to set up a local coordinate system in the oil film. The quantities $h$ and
$r_0$ are the natural length scales with which to measure displacements across and
along the film, so we let $\eta$ measure displacement across the film, with $\eta = 0$ at
the surface of the probe and $\eta = 1$ at the surface of the water, and let $\xi$ measure
displacement along the film, with $\xi = 0$ at the probe tip and $\xi = 1$ a distance
$r_0$ from the tip. Away from the tip and the edge of the probe, which we will not
consider for the moment, this provides us with an orthogonal coordinate system,
and (12.161) becomes

$$\frac{\partial^2\phi}{\partial\eta^2} + \delta^2\frac{\partial^2\phi}{\partial\xi^2} = 0,$$

where

$$\delta = \frac{h}{r_0} \ll 1.$$

At leading order, $\partial^2\phi/\partial\eta^2 = 0$, and hence $\phi$ varies linearly across the film, with

$$\phi = A(\xi)\eta + B(\xi). \tag{12.167}$$


Turning our attention now to (12.163)$_2$, since we expect variations of $\phi$ in the water
and the insulator to take place over the geometrical length scale $r_0$, we have

$$\frac{\partial\phi}{\partial\eta} = \frac{\delta}{\delta_j}\frac{\partial\phi}{\partial n} \text{ at interfaces,} \tag{12.168}$$

where

$$\delta_j = \frac{\sigma_o}{\sigma_j},$$

with the subscripts o, w and i indicating oil, water and insulator respectively.
We expect that $\delta_w \ll 1$ and $\delta_i = O(1)$, since oil and the insulator have similar
conductivities, both much less than that of saline water. We conclude that, at the
interface between oil and insulator, at leading order $\partial\phi/\partial\eta = 0$, and hence $A(\xi) = 0$
there. Also, from the conditions at the surface of the cladding and core, (12.162),
$B(\xi) = 0$ at the cladding and $B(\xi) = 1$ at the core.

Returning now to (12.168) with $j = w$, note that we have two small parameters,
$\delta$ and $\delta_w$. Double limiting processes like this ($\delta \to 0$, $\delta_w \to 0$) have to be treated
with care, as the final result usually depends on how fast one parameter tends
to zero compared with the other. In this case, we obtain the richest asymptotic
balance by assuming that

$$\frac{\delta_w}{\delta} = K = \frac{r_0\sigma_o}{h\sigma_w} = O(1) \text{ as } \delta \to 0,$$


and 




We can now combine all of the information that we have, to show that at the
surface of the probe, the potential in the water satisfies

$$\frac{\partial\phi}{\partial n} = \begin{cases} K(\phi - 1) & \text{at the surface of the core,} \\ 0 & \text{at the surface of the insulator,} \\ K\phi & \text{at the surface of the cladding.} \end{cases} \tag{12.169}$$


The fact that the oil film is thin allows us to apply these conditions at the surface 
of the probe at leading order. The key point is that this asymptotic analysis allows 
us to eliminate the thin film from the geometry of the problem at leading order, 
and instead include its effect in the boundary conditions (12.169). The solution of 
(12.161) subject to (12.164), (12.165) and (12.169) in the region outside the probe 
is geometrically simple, and easily achieved using a computer. We will not show 
how to do this here, as it is outside the scope of this book. We can, however, extract 
one vital piece of information from our analysis. We have proceeded on the basis
that $K = O(1)$. What happens if $K \gg 1$ or $K \ll 1$? If $K \gg 1$, at leading order
(12.169) becomes

$$\phi = \begin{cases} 1 & \text{at the surface of the core,} \\ 0 & \text{at the surface of the cladding,} \end{cases} \qquad \frac{\partial\phi}{\partial n} = 0 \text{ at the surface of the insulator.} \tag{12.170}$$


These are precisely the boundary conditions that would apply at leading order in
the absence of an oil layer. We conclude that if $K \gg 1$, and hence $h \ll r_0\sigma_o/\sigma_w$,
the film of oil is too thin to prevent a current from passing from core to cladding
through the water, and the oil cannot be detected by the probe. If $K \ll 1$, at
leading order (12.169) becomes $\partial\phi/\partial n = 0$ at the surface of the probe, and hence
$\phi = 0$ in the water. This then shows that $\phi = 1 - \eta$ in the oil film over the core
and $\phi = 0$ in the rest of the film, from which it is straightforward to calculate
the current flowing from core to cladding. In this case the oil film dominates the
response of the probe, effectively insulating it from the water outside.

We conclude that there is a critical oil film thickness, $h_c = r_0\sigma_o/\sigma_w$. For $h \ll h_c$
the oil film is effectively invisible, for $h \gg h_c$ the external fluid is effectively invisible,
and the current through the probe is determined by the thickness of the film, whilst
for $h = O(h_c)$ both the oil and water affect the current through the boundary
conditions (12.169). For a typical oil and saline water, $h_c \approx 10^{-9}$ m. This is such a
small length that, in practice, any thin oil film coating a probe insulates it from the
external fluid, and can lead to practical difficulties with this technique. In reality,
local probes are used with alternating rather than direct current driving the core.
One helpful effect of this is to increase the value of $h_c$, due to the way that the
impedances† of oil and water change with the frequency of the driving potential.
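
The three regimes can be summarised in a few lines of Python. This is only a sketch: the probe radius and the two conductivities below are hypothetical values chosen so that $h_c = 10^{-9}$ m, matching the order of magnitude quoted above, and the factor `margin` that separates "much less than" from "of the same order" is an arbitrary illustrative threshold.

```python
def critical_thickness(r0, sigma_oil, sigma_water):
    # h_c = r0 * sigma_o / sigma_w, the critical oil film thickness.
    return r0 * sigma_oil / sigma_water

def film_regime(h, r0, sigma_oil, sigma_water, margin=100.0):
    # K = h_c / h, so K >> 1 corresponds to h << h_c.
    K = critical_thickness(r0, sigma_oil, sigma_water) / h
    if K >= margin:
        return K, "oil film effectively invisible"
    if K <= 1.0 / margin:
        return K, "oil film insulates the probe"
    return K, "oil film and water both matter"

# Hypothetical values: r0 in metres, conductivities in siemens per metre.
r0, sigma_oil, sigma_water = 1.0e-4, 1.0e-5, 1.0
for h in (1.0e-12, 1.0e-9, 1.0e-6):
    print(h, film_regime(h, r0, sigma_oil, sigma_water))
```

With these numbers even a film one micrometre thick lies deep in the insulating regime, which is the practical difficulty described in the text.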


Exercises 

12.1 Determine the first two terms in the asymptotic expansion for $0 < \epsilon \ll 1$
of all the roots of each of the equations

(a) $x^3 + \epsilon x^2 - x + \epsilon = 0$,

(b) $\epsilon x^3 + x^2 - 1 = 0$,

(c) $\epsilon x^4 + (1 - 3\epsilon)x^3 - (1 + 3\epsilon)x^2 - (1 + \epsilon)x + 1 = 0$,

(d) $\epsilon x^4 + (1 - 3\epsilon)x^3 - (1 - 3\epsilon)x^2 - (1 + \epsilon)x + 1 = 0$.

In each case, sketch the left hand side of the equation for $\epsilon = 0$ and $\epsilon \ll 1$.

12.2 The function $y(x)$ satisfies the ordinary differential equation

$$\epsilon y'' + (4 + x^2)(y' + 2y) = 0, \quad \text{for } 0 < x < 1,$$

subject to $y(0) = 0$ and $y(1) = 1$, with $\epsilon \ll 1$. Show that a boundary layer is
possible only at $x = 0$. Use the method of matched asymptotic expansions
to determine two-term inner and outer expansions, which you should match
using either Van Dyke's matching principle, or an intermediate variable.
Hence show that

$$y'(0) \sim \frac{4e^2}{\epsilon} - 4e^2 + 8e^2\tan^{-1}\left(\tfrac{1}{2}\right) \text{ as } \epsilon \to 0.$$

Construct a composite expansion, valid up to $O(\epsilon)$.

12.3 Determine the leading order outer and inner approximations to the solution
of

$$\epsilon y'' + x^{1/2}y' + y = 0 \quad \text{for } 0 < x < 1,$$



† the a.c. equivalents of the conductivities.

12.4 The function $y(x)$ satisfies the ordinary differential equation

$$\epsilon y'' + (1 + x)y' - y + 1 = 0,$$

for $0 < x < 1$, subject to the boundary conditions $y(0) = y(1) = 0$, with
$\epsilon \ll 1$, where a prime denotes $d/dx$. Determine a two-term inner expansion
and a one-term outer expansion. Match the expansions using either Van
Dyke's matching principle or an intermediate region. Hence show that

$$y'(0) \sim \frac{1}{2\epsilon} + \frac{1}{2} \text{ as } \epsilon \to 0.$$

12.5 Consider the boundary value problem

$$\epsilon(2y + y'') + 2xy' - 4x^2 = 0 \quad \text{for } -1 \le x \le 2,$$

subject to

$$y(-1) = 2, \quad y(2) = 7,$$

with $\epsilon \ll 1$. Show that it is not possible to have a boundary layer at either
$x = -1$ or $x = 2$. Determine the rescaling needed for an interior layer at
$x = 0$. Find the leading order outer solution away from this interior layer,
and the leading order inner solution. Match these two solutions, and hence
show that $y(0) \sim 2$ as $\epsilon \to 0$. Sketch the leading order solution.

Now determine the outer solutions up to $O(\epsilon)$. Show that a term of
$O(\epsilon\log\epsilon)$ is required in the inner expansion. Match the two-term inner
and outer expansions, and hence show that $y(0) = 2 - \tfrac{3}{2}\epsilon\log\epsilon + O(\epsilon)$ for
$\epsilon \ll 1$.

12.6 Consider the ordinary differential equation

$$\epsilon y'' + yy' - y = 0 \quad \text{for } 0 < x < 1,$$

subject to $y(0) = \alpha$, $y(1) = \beta$, with $\alpha$ and $\beta$ constants, and $\epsilon \ll 1$.

(a) Assuming that there is a boundary layer at $x = 0$, determine the
leading order inner and outer solutions when $\alpha = 0$ and $\beta = 3$.

(b) Assuming that there is an interior layer at $x = x_0$, determine the
leading order inner and outer solutions, and hence show that $x_0 =
1/2$ when $\alpha = -1$ and $\beta = 1$.

12.7 Use the method of multiple scales to determine the leading order solution,
uniformly valid for $t \ll \epsilon^{-2}$, of

$$\frac{d^2y}{dt^2} + y = \epsilon y^3$$

subject to $y = 1$, $dy/dt = 0$ when $t = 0$, for $\epsilon \ll 1$.

12.8 Consider the ordinary differential equation

$$\ddot{y} + \epsilon\dot{y} + y + \epsilon^2 y\cos^2 t = 0,$$

for $t > 0$, subject to $y(0) = 1$, $\dot{y}(0) = 0$, where a dot denotes $d/dt$. Use the
method of multiple scales to determine a two-term asymptotic expansion,
uniformly valid for all $t \ll \epsilon^{-3}$ when $\epsilon \ll 1$.

12.9 Consider the ordinary differential equation

$$\ddot{y} + \epsilon\dot{y} + y + \epsilon^2 y^3 = 0,$$

for $t \ge 0$, subject to $y(0) = 1$, $\dot{y}(0) = 0$, where a dot denotes $d/dt$. Use the
method of multiple scales, with

$$T = \epsilon t, \quad \tau = t + \epsilon^2 at, \quad y = y(\tau, T),$$

to show that

$$y \sim e^{-T/2}\cos\tau \quad \text{for } \epsilon \ll 1.$$

Show further that $a = -1/8$ and determine the next term in the asymptotic
expansion. (You will need to consider the first three terms in the asymptotic
expansion for $y$.)

12.10 Show that when $v_0 \ll 1$, (12.81) becomes $dE/dT = -2E$ at leading order.

12.11 Consider the initial value problem

$$\frac{d^2y}{dt^2} - 2\epsilon\frac{dy}{dt} - y + y^3 = 0, \tag{E12.1}$$

subject to

$$y(0) = y_i > 1, \quad \frac{dy}{dt}(0) = 0. \tag{E12.2}$$

Use a graphical argument to show that when $\epsilon = 0$, $y$ is positive, provided
that $y_i < \sqrt{2}$. Use Kuzmak's method to determine the leading order
approximation to the solution when $0 < \epsilon \ll 1$ and $1 < y_i < \sqrt{2}$. You should
check that your solution is consistent with the linearized solution when
$y_i - 1 \ll 1$. Hence show that $y$ first becomes negative when $t = t_0 \sim T_0/\epsilon$,
where

T 0 = 



3 K(l) 

4V1 + 2 E (VI + 2 E + 1) 


{(2k 2 -l)L(l)-2(ki-l)K(l)} dE ' 


k = k(E) 


yi + 2E + i \ 1/2 
2 VI + 2 E ) 


Hints: 

(a) The leading order solution can be written in terms of the Jacobian 
elliptic function cn (see Section 9.4). 

(b) 



— fc 2 + k 2 x 2 dx 
12.12 Show that when $D(x, \hat{x}) = D_0(x)\hat{D}(\hat{x})$ and $R(\theta(x), x, \hat{x}) = R_0(\theta(x), x)\hat{R}(\hat{x})$,
the homogenized diffusion coefficient and reaction term given by (12.101)
and (12.102) can be written in the simple form described in Section 12.2.6.

12.13 Give a physical example that would give rise to the initial-boundary value 
problem 


89 

9t 



for 0 < x < 1, 


subject to 


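$$\theta(x, 0) = \theta_i(x) \quad \text{for } 0 \le x \le 1,$$

$$\frac{\partial\theta}{\partial x} = 0 \quad \text{at } x = 0 \text{ and } x = 1 \text{ for } t > 0.$$

When $0 < \epsilon \ll 1$, use homogenization theory to show that, at leading
order, $\theta$ is a function of $x$ and $t$ only, and satisfies an equation of the form

$$F(x)\frac{\partial\theta}{\partial t} = \frac{\partial}{\partial x}\left(D(x)\frac{\partial\theta}{\partial x}\right),$$

where $D(x)$ is given by (12.101), and $F(x)$ is a function that you should
determine. If $D(x, x/\epsilon) = D_0(x)\hat{D}(x/\epsilon)$, show that, at leading order, $\theta$
satisfies the diffusion equation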


dd d 


dt dx 


dd\ 


RI = HZ\ A>(z)w- 


dx J 


where 


t = 


t 


( 2 ' 2 J o V, 5fo* 


liin ff — ,o 

12.14 Use the WKB method to find the eigensolutions of the differential equation

$$y''(x) + (\lambda - x^2)y(x) = 0,$$

subject to $y \to 0$ as $|x| \to \infty$, when $\lambda \gg 1$.

12.15 Find the first two terms in the WKB approximation to the solution of the
fourth order equation

$$\epsilon y''''(x) = \{1 - \epsilon V(x)\}y(x)$$

that satisfies $V'(\pm\infty) = 0$ and $y(\pm\infty) = 0$, when $\epsilon \ll 1$.

12.16 Solve the connection problem 


e z y”(x) + cx 2 y{x) = 0, 


subject to $y(0) = 1$ and $y \to 0$ as $x \to -\infty$, when $\epsilon \ll 1$.

12.17 By determining the next term in the WKB expansion, verify that the
overlap domain for the inner and outer solutions of the boundary value problem
(12.115) is $\epsilon^{2/3} \ll |x| \ll \epsilon^{2/5}$ (see the footnote just after (12.121)).

12.18 Use Fourier transforms to solve (12.128) subject to (12.129) with $f_0(x) =
e^{-|x|}$, and hence show that

$$c(x,t) = \tfrac{1}{2}e^{x + Dt}\,\mathrm{erfc}\left(\frac{x}{\sqrt{4Dt}} + \sqrt{Dt}\right) + \tfrac{1}{2}e^{-x + Dt}\,\mathrm{erfc}\left(-\frac{x}{\sqrt{4Dt}} + \sqrt{Dt}\right).$$

Use this expression to show that $c(0,t) \sim 1/\sqrt{\pi Dt}$ as $t \to \infty$. By comparing
this result with the large time asymptotic solution that we derived in
Section 12.3, show that $\int F_\pm(\eta)\,d\eta = \sqrt{2\pi D}$.

12.19 Find the small time solution of the reaction-diffusion equation

$$u_t = Du_{xx} + \alpha u,$$

subject to the initial condition

$$u(x, 0) = \begin{cases} f_0(x) & \text{for } |x| < a, \\ 0 & \text{for } |x| \ge a. \end{cases}$$

What feature of the solution arises as a result of the reaction term, $\alpha u$?

12.20 Find the leading order solution of the partial differential equation

$$c_t = (D + \epsilon x)^2 c_{xx} \quad \text{for } x > 0, \; t > 0,$$

when $\epsilon \ll 1$, subject to the initial condition $c(x, 0) = f(x)$ and the boundary
condition $c(0, t) = 0$. Your solution should remain valid for large $x$ and $t$.

12.21 Find a uniformly valid solution of the hyperbolic equation

$$\epsilon(u_t + u_x) + (t - 1)^2 u = 1 \quad \text{for } -\infty < x < \infty, \; t > 0,$$

when $\epsilon \ll 1$, subject to the initial condition $u(x, 0) = 0$.

12.22 Find the leading order asymptotic solution of

$$\epsilon(u_{xx} + u_{yy}) + u_x + \beta u_y = 0 \quad \text{for } x > 0, \; 0 < y < L,$$

when $\epsilon \ll 1$, subject to the boundary conditions $u(x, 0) = f(x)$, $u(0, y) =
g(y)$ and $u(x, L) = 0$. Your solution should be uniformly valid in the
domain of solution, so you will need to resolve any boundary layers that
are required.

12.23 Find a uniformly valid leading order asymptotic solution of

$$\epsilon(u_{tt} - c^2 u_{xx}) + u_t + \alpha u_x = 0 \quad \text{for } -\infty < x < \infty, \; t > 0,$$

when $\epsilon \ll 1$, subject to the initial conditions $u(x, 0) = f(x)$, $u_t(x, 0) = 0$.
How does the solution for $c > |\alpha|$ differ from the solution when $c < |\alpha|$?

12.24 Project: The triple deck

Consider the innocuous looking two-point boundary value problem

$$\epsilon^3 y'' + x^3 y' + (x^3 - \epsilon)y = 0, \quad \text{subject to } y(0) = \tfrac{1}{2}, \; y(1) = 1, \tag{E12.3}$$

with $\epsilon \ll 1$.

(a) Show that the outer solution is $y = e^{1 - x}$.

(b) Since the outer solution does not satisfy the boundary condition at
$x = 0$, there must be a boundary layer. By writing $\bar{x} = x/\epsilon^{\nu}$, show
that there are two possibilities, $\nu = 1$ and $\nu = 1/2$.

As there is now a danger of confusion in our notation, we will define
$\bar{x} = x/\epsilon$, and refer to the region where $\bar{x} = O(1)$ as the lower deck,
and $x^* = x/\epsilon^{1/2}$, and refer to the region where $x^* = O(1)$ as the
middle deck. We will refer to the region where $x = O(1)$ as the
upper deck. This nomenclature comes from the study of boundary
layers in high Reynolds number fluid flow.

(c) Show that the leading order solution in the lower deck is $\bar{y} = \tfrac{1}{2}e^{-\bar{x}}$,
and in the middle deck $y^* = a\exp\left\{-1/2(x^*)^2\right\}$.

(d) Apply the simplest form of matching condition to show that $a = e$,
and hence that the solution in the middle deck is

$$y^* = \exp\left\{1 - \frac{1}{2(x^*)^2}\right\}.$$

(e) The matching between the lower and middle decks requires that

$$\lim_{\bar{x} \to \infty}\bar{y} = \lim_{x^* \to 0}y^*,$$

which is satisfied automatically without fixing any constants. Show
that the composite solution takes the form

(f) Integrate (E12.3) numerically using MATLAB, and compare the
numerical solution with the asymptotic solution for $\epsilon = 0.1$, $0.01$ and
$0.001$. Can you see the structure of the triple deck?

(g) The equations for steady, high Reynolds number flow of a viscous
Newtonian fluid with velocity $\mathbf{u}$ and pressure $p$ are

$$\nabla\cdot\mathbf{u} = 0, \quad (\mathbf{u}\cdot\nabla)\mathbf{u} = -\nabla p + \frac{1}{Re}\nabla^2\mathbf{u},$$

where $Re$ is the Reynolds number. If these are to be solved in
$-\infty < x < \infty$, $y > f(x)$ subject to $\mathbf{u} = 0$ on $y = f(x)$ and $\mathbf{u} \sim U\mathbf{i}$
as $x^2 + y^2 \to \infty$, where would you expect a triple or double deck
structure to appear?

You can find some more references and background material in
Sobey (2000). From the data in this book, estimate the maximum
thickness of the boundary layer on a plate 1 m long, with water
flowing past it at 10 m s$^{-1}$.

Stability, Instability and Bifurcations


In this chapter, we will build upon the ideas that we introduced in Chapter 9, 
which was concerned with phase plane methods for nonlinear, autonomous, ordinary 
differential equations. We will study what happens when an equilibrium point has 
one or more eigenvalues with zero real part. These are nonhyperbolic equilibrium 
points, which we were unable to study by simple linearization in Chapter 9. Next, 
we will introduce the idea of Lyapunov functions, and show how they can be used 
to study the stability of nonhyperbolic equilibrium points. We will also consider 
differential equations that contain a parameter, and examine what can happen to 
the qualitative form of the solutions as the parameter varies. At parameter values 
where one or more equilibrium points is nonhyperbolic, a local bifurcation point, 
the qualitative nature of the solutions changes. Finally, we will look at an example 
of a global bifurcation. 


13.1 Zero Eigenvalues and the Centre Manifold Theorem 

Let’s consider the structure of an equilibrium point which has one zero eigenvalue 
and one negative eigenvalue. After shifting the equilibrium point to the origin, 
writing the system in terms of coordinates with axes in the directions of the eigen- 
vectors, and rescaling time so that the negative eigenvalue has unit magnitude, the 
system will have the form 

$$\dot{x} = P(x,y), \quad \dot{y} = -y + Q(x,y), \tag{13.1}$$

where $P$ and $Q$ are nonlinear functions with $P(0,0) = Q(0,0) = 0$. The linear
approximation to (13.1) is $\dot{x} = 0$, $\dot{y} = -y$, which has the solution $x = 0$, $y = y_0e^{-t}$
that passes through the origin. This shows that points on the $y$-axis close to the
origin approach the origin as $t \to \infty$. We say that the local stable manifold† in
some neighbourhood, $U$, of the origin is

$$\omega^s(0, U) = \{(x,y) \mid x = 0, \; (x,y) \in U\}.$$

This is the set of points that are, locally, attracted to the origin in the direction
of the eigenvector that corresponds to the negative eigenvalue. We are going to
have to work harder to determine the behaviour of the other integral paths near
the origin.

† We will carefully define what we mean by a manifold in Section 13.1.2.

In order to work out what is going on, we define a local centre manifold for
(13.1) to be a $C^k$ curve $\omega^c(0, U)$ in a neighbourhood $U$ of the origin such that

(i) $\omega^c(0, U)$ is an invariant set within $U$, so that if $\mathbf{x}(0) \in \omega^c(0, U)$, then
$\mathbf{x}(t) \in \omega^c(0, U)$ when $\mathbf{x}(t) \in U$. In other words, a solution that initially
lies on the centre manifold remains there when it lies in $U$.

(ii) $\omega^c(0, U)$ is the graph of a $C^k$ function that is tangent to the $x$-axis at the
origin, so that

$$\omega^c(0, U) = \left\{(x,y) \;\middle|\; y = h(x), \; h(0) = \frac{dh}{dx}(0) = 0, \; (x,y) \in U\right\}.$$

This gives us the picture of the local stable and centre manifolds shown in Fig- 
ure 13.1. The main idea now is that the qualitative behaviour of the solution in 



Fig. 13.1. The local stable and centre manifolds of the system (13.1). 

the neighbourhood of the origin, excluding the stable manifold, is determined by 
the behaviour of the solution on the centre manifold. This means that the local 
dynamics are governed by a first order differential equation. 

Theorem 13.1 (The centre manifold theorem) The equilibrium point at the
origin of the system (13.1) is stable/unstable if and only if the equilibrium point at
$x = 0$ of the first order differential equation

$$\dot{x} = P(x, h(x)), \tag{13.2}$$

where $y = h(x)$ is the local centre manifold, is stable/unstable. Integral paths in the
neighbourhood of the local centre manifold are attracted onto the centre manifold
exponentially fast.

Proof This can be found in Wiggins (1990). □


13.1.1 Construction of the Centre Manifold 

Now that we know that a local centre manifold exists, how can we determine its
equation, $y = h(x)$? Since $\dot{y} = \dot{x}\,dh/dx$, (13.2) shows that

$$\frac{dh}{dx}P(x, h(x)) = -h(x) + Q(x, h(x)), \tag{13.3}$$

subject to

$$h(0) = \frac{dh}{dx}(0) = 0. \tag{13.4}$$

We will not usually be able to solve (13.3) analytically, but we can proceed to
determine the local form of the centre manifold by assuming that $h(x)$ can be
represented as a power series,

$$h(x) = a_2x^2 + a_3x^3 + \cdots, \tag{13.5}$$

which automatically satisfies (13.4).

As an example, let's consider the system

$$\dot{x} = ax^3 + xy, \quad \dot{y} = -y + y^2 + x^2y + x^3 \tag{13.6}$$

with $a > 0$. This is of the form (13.1), so we need to substitute (13.5) into (13.3).
This shows that

$$(2a_2x + 3a_3x^2 + \cdots)(ax^3 + a_2x^3 + a_3x^4 + \cdots)$$
$$= -a_2x^2 - a_3x^3 + (a_2x^2 + a_3x^3 + \cdots)^2 + x^2(a_2x^2 + a_3x^3 + \cdots) + x^3.$$

Equating powers of $x$ gives us $a_2 = 0$ at $O(x^2)$ and $a_3 = 1$ at $O(x^3)$, so that
the centre manifold is given by $y = h(x) = x^3 + \cdots$. On the centre manifold,
we therefore have $\dot{x} = ax^3 + x^4 + \cdots$. For $|x| \ll 1$ we can ignore the quartic term
and just consider $\dot{x} \approx ax^3$, so that $\dot{x} > 0$ for $x > 0$ and $\dot{x} < 0$ for $x < 0$. Integral
paths that begin on the local centre manifold therefore asymptote to the origin as
$t \to -\infty$, and we conclude that the local phase portrait is of a nonlinear saddle,
as shown in Figure 13.2. More specifically, if $x = x_0$ on the centre manifold when
$t = 0$, $x \sim x_0/\sqrt{1 - 2ax_0^2t}$, so that $x \sim x_0/\sqrt{-2ax_0^2t}$ as $t \to -\infty$. This algebraic
behaviour on the centre manifold is in contrast to the exponential behaviour that
occurs on the unstable separatrix of a linear saddle point.
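
The attraction onto the centre manifold can be seen numerically. The following Python sketch integrates (13.6) with a classical Runge-Kutta scheme from a point close to the origin but off the manifold, and compares the final value of $y$ with the leading order approximation $h(x) = x^3$. The value $a = 1$, the starting point, the integration time and the step count are all illustrative assumptions.

```python
import math

def rhs(x, y, a):
    # Right hand sides of the system (13.6).
    return a * x**3 + x * y, -y + y**2 + x**2 * y + x**3

def integrate(x, y, a, t_end, n):
    # Classical fourth order Runge-Kutta with n uniform steps.
    dt = t_end / n
    for _ in range(n):
        k1 = rhs(x, y, a)
        k2 = rhs(x + 0.5 * dt * k1[0], y + 0.5 * dt * k1[1], a)
        k3 = rhs(x + 0.5 * dt * k2[0], y + 0.5 * dt * k2[1], a)
        k4 = rhs(x + dt * k3[0], y + dt * k3[1], a)
        x += dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
        y += dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
    return x, y

def x_on_manifold(t, x0, a):
    # Solution of dx/dt = a x^3 with x(0) = x0, valid while 1 - 2 a x0^2 t > 0.
    return x0 / math.sqrt(1.0 - 2.0 * a * x0**2 * t)

# Start off the manifold; the fast transient decays like exp(-t).
x_end, y_end = integrate(0.1, 0.05, a=1.0, t_end=10.0, n=1000)
print(x_end, y_end, x_end**3)
# Algebraic approach to the origin backwards in time on the manifold.
print(x_on_manifold(-50.0, 0.1, 1.0), x_on_manifold(-5000.0, 0.1, 1.0))
```

After the transient, $y$ agrees with $x^3$ up to the higher order terms neglected in (13.5), while $x$ itself has moved only slowly away from the origin.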

Example: Travelling waves in cubic autocatalysis 
If two chemicals, which we label A and B, react through a mechanism known as
cubic autocatalysis, we write

$$A + 2B \to 3B, \quad \text{rate } kab^2, \tag{13.7}$$

Fig. 13.2. The local phase portrait for the system (13.6).

where $k$ is the reaction rate constant and $a$ and $b$ are the concentrations of the
two chemicals, which are measured in moles m$^{-3}$†. The chemical B is known as the
autocatalyst, since it catalyses its own production. The greater the concentration
of B, the faster it is produced by the reaction (13.7). If these two chemicals then
react in a long thin tube, so that their concentrations only vary in the $x$-direction,
the main physical processes that act, in the absence of any underlying fluid flow,
are chemical reaction and one-dimensional diffusion. We can derive the partial
differential equations that govern this situation using the arguments that we described
in Section 12.2.3, example 2. Specifically, since the rate of change of the amount
of each chemical in a control volume is equal to the total diffusive flux through
the bounding surface plus the total rate of production of that chemical within the
volume, we find that

$$\frac{\partial a}{\partial t} = D\frac{\partial^2 a}{\partial x^2} - kab^2, \quad \frac{\partial b}{\partial t} = D\frac{\partial^2 b}{\partial x^2} + kab^2, \tag{13.8}$$

where $t$ is time and $D$ the constant diffusivity of the chemicals. If a small amount
of the autocatalyst is introduced locally into a spatially uniform expanse of A with
$a = a_0$, waves of chemical reaction will propagate away from the initiation site. We
can study one of these by looking for a wave that propagates from left to right at
constant speed $v > 0$, represented by a solution of the form $a = a(y)$, $b = b(y)$,
where $y = x - vt$. Such a solution is called a travelling wave solution.‡

† A mole is a fixed number (Avogadro's number $\approx 6.02 \times 10^{23}$) of molecules of the substance.
‡ For more background and details on reaction-diffusion equations and travelling wave solutions,
see Billingham and King (2001).

With $a$ and $b$ functions of $y$ alone, (13.8) become nonlinear ordinary differential
equations,

$$D\frac{d^2a}{dy^2} + v\frac{da}{dy} - kab^2 = 0, \quad D\frac{d^2b}{dy^2} + v\frac{db}{dy} + kab^2 = 0, \tag{13.9}$$

to be solved subject to the boundary condition

$$a \to a_0, \quad b \to 0 \text{ as } y \to \infty. \tag{13.10}$$

This represents the unreacted state ahead of the wave. If we now add equations
(13.9), we obtain a linear differential equation for $a + b$, which we can solve to
obtain

$$a + b = k_0 + k_1e^{-vy/D}.$$

The boundary condition (13.10) then shows that $k_0 = a_0$. We also require that $a + b$
is bounded as $y \to -\infty$, so that $k_1 = 0$, and hence $a + b = a_0$. We can therefore
eliminate $a$ from (13.9) and arrive at a second order system

$$\frac{db}{dy} = c, \quad D\frac{dc}{dy} = -vc - kb^2(a_0 - b), \tag{13.11}$$

subject to

$$b \to 0, \quad c \to 0 \text{ as } y \to \infty. \tag{13.12}$$

Note that, since we require that both of the chemical concentrations should be
positive for a physically meaningful solution, we need $0 \le b \le a_0$.

The only equilibrium points of (13.11) are $b = c = 0$ and $b = a_0$, $c = 0$. The
second of these represents the fully reacted state behind the travelling wave, where
all of the chemical A has been converted into the autocatalyst, B, so we also require
that

$$b \to a_0, \quad c \to 0 \text{ as } y \to -\infty. \tag{13.13}$$

We can write the boundary value problem given by (13.11) to (13.13) in a more
convenient form by defining the dimensionless variables

$$\beta = \frac{b}{a_0}, \quad \gamma = \sqrt{\frac{D}{ka_0^2}}\,\frac{c}{a_0}, \quad z = \sqrt{\frac{ka_0^2}{D}}\,y, \quad V = \frac{v}{\sqrt{ka_0^2D}},$$

so that

$$\frac{d\beta}{dz} = \gamma, \quad \frac{d\gamma}{dz} = -V\gamma - \beta^2(1 - \beta), \tag{13.14}$$

subject to

$$\beta \to 0, \quad \gamma \to 0 \text{ as } z \to \infty, \tag{13.15}$$

$$\beta \to 1, \quad \gamma \to 0 \text{ as } z \to -\infty, \tag{13.16}$$

and $0 \le \beta \le 1$ for a physically meaningful solution. The dimensionless wave speed,
$V$, is now the only parameter.

The next step is to study the solutions of (13.14) subject to (13.15) and (13.16) in
the $(\beta, \gamma)$ phase plane and determine for what values of the wave speed, $V$, solutions
exist. As a preliminary, let’s see if we can guess a solution of this system. A plausible
functional form is $\gamma = k\beta(1 - \beta)$, which satisfies the boundary conditions. Since

$$\frac{d\gamma}{d\beta} = -V - \frac{\beta^2(1 - \beta)}{\gamma}, \tag{13.17}$$

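we find that we can satisfy this equation with $k = -1/\sqrt{2}$ and $V = 1/\sqrt{2}$. This
gives

$$\frac{d\beta}{dz} = -\frac{1}{\sqrt{2}}\beta(1 - \beta),$$

and hence

$$\beta = \beta_e(z) = \frac{1}{1 + e^{(z - z_0)/\sqrt{2}}}, \qquad \gamma = \gamma_e(z) = -\frac{1}{\sqrt{2}}\,\frac{e^{(z - z_0)/\sqrt{2}}}{\left(1 + e^{(z - z_0)/\sqrt{2}}\right)^2}. \tag{13.18}$$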


This is an exact solution for $V = 1/\sqrt{2}$ and any constant $z_0$, as shown in
Figure 13.3. The presence of $z_0$ in (13.18) simply shows that the solution can be given
an arbitrary displacement in the $z$-direction and remain a solution, as we would
expect for a wave that propagates at constant speed without change of form.
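The closed-form solution can be checked numerically. The following is a minimal sketch, assuming the forms of (13.14) and (13.18) given above; the function names and the central-difference step `h` are illustrative choices, not part of the text. It evaluates the residuals of both equations in (13.14) at a few sample points.

```python
import math

V = 1 / math.sqrt(2)  # wave speed for which the closed-form solution holds


def beta_e(z, z0=0.0):
    """Travelling-wave profile beta_e(z) from (13.18)."""
    return 1.0 / (1.0 + math.exp((z - z0) / math.sqrt(2)))


def gamma_e(z, z0=0.0):
    """gamma_e(z) = d(beta_e)/dz from (13.18)."""
    e = math.exp((z - z0) / math.sqrt(2))
    return -e / (math.sqrt(2) * (1.0 + e) ** 2)


def residuals(z, h=1e-5):
    """Central-difference residuals of the two equations in (13.14)."""
    dbeta = (beta_e(z + h) - beta_e(z - h)) / (2 * h)
    dgamma = (gamma_e(z + h) - gamma_e(z - h)) / (2 * h)
    b, g = beta_e(z), gamma_e(z)
    r1 = dbeta - g                               # d(beta)/dz - gamma
    r2 = dgamma - (-V * g - b * b * (1.0 - b))   # d(gamma)/dz - right-hand side
    return r1, r2


# Largest residual over a handful of sample points; it should be tiny.
worst = max(max(abs(r) for r in residuals(z)) for z in [-5, -1, 0, 0.5, 3, 8])
print(worst)
```

The residuals are limited only by the finite-difference error, and the profile tends to 1 behind the wave and to 0 ahead of it, as (13.15) and (13.16) require.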



Fig. 13.3. The analytical solution of the cubic travelling wave problem, $\beta = \beta_e(z)$, with
$z_0 = 0$.

So now we know that a solution exists for $V = 1/\sqrt{2}$. What about other values of
$V$? Let’s go along our usual route, and determine the nature of the two equilibrium
points, $P_1 = (1, 0)$ and $P_2 = (0, 0)$. The Jacobian is

$$J = \begin{pmatrix} 0 & 1 \\ -2\beta + 3\beta^2 & -V \end{pmatrix}.$$

At $P_1$, the eigenvalues are real and of opposite sign, so that $P_1$ is a saddle point.
Boundary condition (13.16) shows that we need the integral path that represents
the solution to asymptote to $P_1$ as $z \to -\infty$. Since the unstable separatrices of $P_1$
are the only integral paths that do this, the solution must be represented by the
unstable separatrix of $P_1$ that lies in the physically meaningful region, $0 \le \beta \le 1$,
shown in Figure 13.4 as $S_1$. The other boundary condition, (13.15), shows that
we need $S_1$ to asymptote to the other equilibrium point, $P_2$, as $z \to \infty$, if it is to
represent a solution.



Fig. 13.4. The local behaviour in the neighbourhood of the two equilibrium points. 

At $P_2 = (0, 0)$, the eigenvalues are $-V$ and zero, with associated eigenvectors
$\mathbf{e}_- = (1, -V)$ and $\mathbf{e}_0 = (1, 0)$, respectively, so that this is a nonhyperbolic
equilibrium point. Since $V > 0$, there is a local stable manifold in the direction of $\mathbf{e}_-$
and also a local centre manifold tangent to the $\beta$-axis (the direction of $\mathbf{e}_0$). We
can construct a local approximation to the centre manifold, $\gamma = h(\beta)$, by assuming
that $\gamma \sim A\beta^2$ as $\beta \to 0$ for some constant $A$. The governing equations, (13.14),
then show that

$$\frac{d\gamma}{dz} \sim 2A\beta\frac{d\beta}{dz} \sim 2A\beta\gamma \sim 2A^2\beta^3,$$

and hence that

$$2A^2\beta^3 \sim -VA\beta^2 - \beta^2(1 - \beta).$$

By equating coefficients of $\beta^2$, we find that $A = -1/V$, and hence that the local
centre manifold has $\gamma \sim -\beta^2/V$ as $\beta \to 0$. This means that $\gamma = d\beta/dz < 0$ on
the local centre manifold. Points on the centre manifold in $\beta > 0$ are therefore
attracted to $P_2$ as $z \to \infty$ with, from $d\beta/dz \sim -\beta^2/V$, $\beta \sim V/z$ as $z \to \infty$. In
contrast, points on the centre manifold in $\beta < 0$ are swept away as $z$ increases. This
type of behaviour is characteristic of a new type of equilibrium point, known as a
saddle-node. To the right of the stable manifold, the point behaves like a stable
node, attracting integral paths onto the centre manifold and into the origin, whilst
to the left of the stable manifold the point is unstable, as shown in Figure 13.4.
Apart from the local stable manifold, on which $\beta = O(e^{-Vz})$ as $z \to \infty$ (recall that
the negative eigenvalue of $P_2$ is $-V$), any other integral paths that asymptote to
the origin as $z \to \infty$ do so on the centre manifold, with $\beta \sim V/z$.
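The local classification of the two equilibrium points can be reproduced with a few lines of code. This is a sketch under the assumption that the Jacobian of (13.14) is obtained by differentiating its right-hand sides, so that its second row is $(-2\beta + 3\beta^2, -V)$; the helper names are hypothetical and the value $V = 1$ is an arbitrary illustration.

```python
import math


def jacobian(beta, V):
    """Jacobian of (13.14) at (beta, gamma); it does not depend on gamma."""
    return ((0.0, 1.0), (-2.0 * beta + 3.0 * beta ** 2, -V))


def eigenvalues_2x2(m):
    """Eigenvalues of a 2x2 matrix from its trace and determinant.

    Assumes the eigenvalues are real, which holds at both P1 and P2.
    """
    (a, b), (c, d) = m
    tr, det = a + d, a * d - b * c
    disc = math.sqrt(tr * tr - 4.0 * det)
    return (tr - disc) / 2.0, (tr + disc) / 2.0


V = 1.0  # illustrative wave speed
p1 = eigenvalues_2x2(jacobian(1.0, V))  # P1 = (1, 0): opposite signs, a saddle
p2 = eigenvalues_2x2(jacobian(0.0, V))  # P2 = (0, 0): -V and zero, nonhyperbolic
print(p1, p2)
```

At $P_1$ the product of the eigenvalues is $-1$ for every $V$, which is why the saddle persists for all wave speeds, whilst the zero eigenvalue at $P_2$ is what forces the centre manifold calculation above.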

Finally, for any given value of $V$, we need to determine whether the integral path
$S_1$ lies to the right of the stable manifold of $P_2$, which we label $S_2$, and therefore
enters the origin and represents a solution, or whether $S_1$ lies to the left of $S_2$, is
swept away from the origin into $\beta < 0$, and therefore does not represent a physically
meaningful solution. We will return to this problem in Section 13.3.3.


13.1.2 The Stable, Unstable and Centre Manifolds 

We end this section by defining more carefully what we mean by a manifold,
and generalizing the definitions of the stable, unstable and centre manifolds to $n$th
order systems.

Let’s begin with some definitions. A homeomorphism is a mapping $f : L \to N$
that is one-to-one, onto and continuous and has a continuous inverse. Here, $L$ and
$N$ are subsets of $\mathbb{R}^n$. A $C^k$ diffeomorphism is a mapping $f : L \to N$ that is
one-to-one, onto and $k$ times differentiable with a $k$-times differentiable inverse. A
smooth diffeomorphism is a $C^\infty$ diffeomorphism and a homeomorphism is a $C^0$
diffeomorphism. An $m$-dimensional manifold is a set $M \subset \mathbb{R}^n$ for which each
$\mathbf{x} \in M$ has a neighbourhood $U$ in which there exists a homeomorphism $\phi : U \to \mathbb{R}^m$,
where $m \le n$. A manifold is said to be differentiable if there is a diffeomorphism
rather than a homeomorphism $\phi : U \to \mathbb{R}^m$. For example, a smooth curve in
$\mathbb{R}^3$ is a one-dimensional differentiable manifold, and the surface of a sphere is a
two-dimensional differentiable manifold.

Now that we have defined these ideas, let’s consider the behaviour of the solutions
of

$$\dot{\mathbf{x}} = \mathbf{f}(\mathbf{x}), \tag{13.19}$$

where $\mathbf{x}, \mathbf{f}(\mathbf{x}) \in \mathbb{R}^n$. In the neighbourhood of an equilibrium point, $\bar{\mathbf{x}}$, of (13.19),
there exist three invariant manifolds.

(i) The local stable manifold, $\omega^s_{\rm loc}$, of dimension $s$, is spanned by the
eigenvectors of $A$ whose eigenvalues have real parts less than zero.

(ii) The local unstable manifold, $\omega^u_{\rm loc}$, of dimension $u$, is spanned by the
eigenvectors of $A$ whose eigenvalues have real parts greater than zero.

(iii) The local centre manifold, $\omega^c_{\rm loc}$, of dimension $c$, is spanned by the
eigenvectors of $A$ whose eigenvalues have zero real parts.

Note that $s + c + u = n$. Solutions lying in $\omega^s_{\rm loc}$ are characterized by exponential
decay and those in $\omega^u_{\rm loc}$ by exponential growth. The behaviour on the centre
manifold is determined by the nonlinear terms in (13.19), as we described earlier in this
section. For a linear system these manifolds exist globally, whilst for a nonlinear
system they exist in some neighbourhood of the equilibrium point.

Example

Let’s consider the simple system

$$\dot{x} = x(2 - x), \qquad \dot{y} = -y + x, \tag{13.20}$$

and try to determine the equations of the local manifolds that pass through the
equilibrium point at the origin. The Jacobian at the origin is

$$\begin{pmatrix} 2 & 0 \\ 1 & -1 \end{pmatrix},$$

which has eigenvalues 2 and $-1$ and associated eigenvectors of $(3, 1)^T$ and $(0, 1)^T$
respectively, so that $\omega^u_{\rm loc}$ is the line $y = x/3$ and $\omega^s_{\rm loc}$ is the $y$-axis. We can solve
the nonlinear system directly, since the equation for $x$ is independent of $y$, and the
equation for $y$ is linear. The solution is

$$x = \frac{2A}{A + 2e^{-2t}}, \qquad y = e^{-t}\left\{B + 2e^{t} - 2\sqrt{\frac{2}{A}}\,\tan^{-1}\!\left(\sqrt{\frac{A}{2}}\,e^{t}\right)\right\},$$

where $A > 0$ and $B$ are constants (for $A < 0$ the inverse tangent is replaced by an
inverse hyperbolic tangent, with $A$ replaced by $|A|$ under the square roots). There is
also the obvious solution $x = 0$, $y = Be^{-t}$, the $y$-axis, which gives the local stable
manifold, $\omega^s_{\rm loc}$, and also the global stable manifold, $\omega^s(\bar{\mathbf{x}})$. Points that lie in the
local unstable manifold, $\omega^u_{\rm loc}$, have $y \to 0$ as $t \to -\infty$. Since $y \sim Be^{-t}$ as $t \to -\infty$,
we must have $B = 0$, so that the branch of the global unstable manifold, $\omega^u(\bar{\mathbf{x}})$, in
$x > 0$ is given in parametric form by

$$\left(\frac{2A}{A + 2e^{-2t}},\; 2 - 2\sqrt{\frac{2}{A}}\,e^{-t}\tan^{-1}\!\left(\sqrt{\frac{A}{2}}\,e^{t}\right)\right).$$

The phase portrait is sketched in Figure 13.5.
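The unstable manifold of the origin can also be traced numerically, without using the closed-form solution. The sketch below assumes only the system (13.20) and the eigenvector $(3, 1)^T$ quoted above; the Runge-Kutta integrator, the starting offset `eps` and the step size are illustrative choices. It starts very close to the origin on the unstable eigenvector and integrates forwards in time.

```python
def f(x, y):
    """Right-hand side of (13.20)."""
    return x * (2.0 - x), -y + x


def rk4_step(x, y, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    k1 = f(x, y)
    k2 = f(x + 0.5 * h * k1[0], y + 0.5 * h * k1[1])
    k3 = f(x + 0.5 * h * k2[0], y + 0.5 * h * k2[1])
    k4 = f(x + h * k3[0], y + h * k3[1])
    x += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
    y += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
    return x, y


def trace_unstable_manifold(eps=1e-6, h=0.01, steps=3000):
    """Start a distance of order eps from the origin on (3, 1) and integrate forwards."""
    x, y = 3.0 * eps, 1.0 * eps
    path = [(x, y)]
    for _ in range(steps):
        x, y = rk4_step(x, y, h)
        path.append((x, y))
    return path


path = trace_unstable_manifold()
print(path[-1])  # end point of the computed branch
```

Close to the origin the computed path hugs the line $y = x/3$, and it ends at the point $(2, 2)$, where both right-hand sides of (13.20) vanish, so this branch of the unstable manifold connects the two equilibrium points.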



Fig. 13.5. The phase portrait of the system (13.20).


13.2 Lyapunov’s Theorems

Although we are now familiar with the idea of the stability of an equilibrium point
in an informal way, in order to develop the idea of a Lyapunov function, we need
to consider some more formal definitions of stability. Note that, in this section,
we will consider systems of autonomous ordinary differential equations, written in
vector form as $\dot{\mathbf{x}} = \mathbf{f}(\mathbf{x})$, that have an isolated equilibrium point at the origin.

An equilibrium point, $\mathbf{x} = \mathbf{x}_e$, is Lyapunov stable if for all $\epsilon > 0$ there exists
a $\delta > 0$ such that for all $\mathbf{x}(0)$ with $|\mathbf{x}(0) - \mathbf{x}_e| < \delta$, $|\mathbf{x}(t) - \mathbf{x}_e| < \epsilon$ for all $t > 0$. In
other words, integral paths that start close to a Lyapunov stable equilibrium point
remain close for all time, as shown in Figure 13.6.


Fig. 13.6. A Lyapunov stable equilibrium point of a second order system.

An equilibrium point, $\mathbf{x} = \mathbf{x}_e$, is asymptotically stable if there exists a $\delta > 0$
such that for all $\mathbf{x}(0)$ with $|\mathbf{x}(0) - \mathbf{x}_e| < \delta$, $|\mathbf{x}(t) - \mathbf{x}_e| \to 0$ as $t \to \infty$. This is
a stronger definition of stability than Lyapunov stability, and states that integral
paths that start sufficiently close to an asymptotically stable equilibrium point are
attracted into it, as illustrated in Figure 13.7. Stable nodes and spirals are both
Lyapunov and asymptotically stable. It should be clear that asymptotically stable
equilibrium points are also Lyapunov stable. However, Lyapunov stable equilibrium
points, for example centres, are not necessarily asymptotically stable.

Now that we have formalized our notions of stability, we need one more new
concept. Let $V : \Omega \to \mathbb{R}$, where $\Omega \subset \mathbb{R}^n$ and $\mathbf{0} \in \Omega$. We say that the function $V$ is
positive definite on $\Omega$ if and only if $V(\mathbf{0}) = 0$ and $V(\mathbf{x}) > 0$ for $\mathbf{x} \in \Omega$, $\mathbf{x} \ne \mathbf{0}$. For
example, for $n = 3$, $V(\mathbf{x}) = V(x_1, x_2, x_3) = x_1^2 + x_2^2 + x_3^2$ is positive definite in $\mathbb{R}^3$,
whilst $V(x_1, x_2, x_3) = x_2^2$ is not, since it is zero on the plane $x_2 = 0$, and not just
at the origin. Note that if $-V(\mathbf{x})$ is a positive definite function, we say that $V$ is a
negative definite function. In the following, we will assume that $V$ is continuous
and has well-defined partial derivatives with respect to each of its arguments, so
that $V \in C^1(\Omega)$.

Fig. 13.7. An asymptotically stable equilibrium point of a second order system.

We can now introduce the idea of the derivative of a function $V(\mathbf{x})$ with respect
to the system $\dot{\mathbf{x}} = \mathbf{f}(\mathbf{x}) = (f_1, f_2, \ldots, f_n)$, which is defined to be the scalar product

$$V^*(\mathbf{x}) = \nabla V \cdot \mathbf{f}(\mathbf{x}) = \frac{\partial V}{\partial x_1}f_1(\mathbf{x}) + \cdots + \frac{\partial V}{\partial x_n}f_n(\mathbf{x}).$$

This derivative can be calculated for given $V(\mathbf{x})$ and $\mathbf{f}(\mathbf{x})$, without knowing the
solution of the differential equation. In particular,

$$\frac{dV}{dt} = \frac{\partial V}{\partial x_1}\dot{x}_1 + \cdots + \frac{\partial V}{\partial x_n}\dot{x}_n = \frac{\partial V}{\partial x_1}f_1(\mathbf{x}) + \cdots + \frac{\partial V}{\partial x_n}f_n(\mathbf{x}) = V^*(\mathbf{x}),$$

so that the total derivative of $V$ with respect to the solution of the equations
coincides with our definition of the derivative with respect to the system. This
allows us to prove three important theorems.
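Because $V^*$ needs only $V$ and $\mathbf{f}$, it is easy to evaluate by machine. The following sketch approximates $\nabla V$ by central differences and forms the scalar product with $\mathbf{f}$; the function `derivative_along_system` and the step `h` are hypothetical names and choices, and the sample $V$ and $\mathbf{f}$ are those of Example 1 later in this section, used here purely as an illustration.

```python
def derivative_along_system(V, f, x, h=1e-6):
    """Approximate V*(x) = grad V . f(x), using central differences for grad V."""
    fx = f(x)
    total = 0.0
    for i in range(len(x)):
        xp = list(x)
        xm = list(x)
        xp[i] += h
        xm[i] -= h
        dV_dxi = (V(xp) - V(xm)) / (2.0 * h)  # i-th partial derivative of V
        total += dV_dxi * fx[i]
    return total


def f_example(p):
    """Vector field of Example 1: (-x - 2y^2, xy - y^3)."""
    x, y = p
    return (-x - 2.0 * y ** 2, x * y - y ** 3)


def V_example(p):
    """Trial function V = x^2 + 2y^2 from Example 1."""
    x, y = p
    return x ** 2 + 2.0 * y ** 2


value = derivative_along_system(V_example, f_example, (0.3, -0.4))
print(value)  # close to -2*(0.3)**2 - 4*(0.4)**4 = -0.2824
```

No solution of the differential equation is computed at any stage, which is the point of the definition.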

Theorem 13.2 If, in some region $\Omega \subset \mathbb{R}^n$ that contains the origin, there exists
a scalar function $V(\mathbf{x})$ that is positive definite and for which $V^*(\mathbf{x}) \le 0$, then the
origin is Lyapunov stable. The function $V(\mathbf{x})$ is known as a Lyapunov function.

Proof Since $V$ is positive definite in $\Omega$, there exists a sphere of radius $r > 0$ (that is,
the set of points with $|\mathbf{x}| \le r$ in $\mathbb{R}^n$) contained within $\Omega$ such that

$$V(\mathbf{x}) > 0 \quad \text{for } \mathbf{x} \ne \mathbf{0} \text{ and } |\mathbf{x}| < r,$$

$$V^*(\mathbf{x}) \le 0 \quad \text{for } |\mathbf{x}| < r.$$

Let $\mathbf{x} = \mathbf{x}(t)$ be the solution of the differential equation $\dot{\mathbf{x}} = \mathbf{f}(\mathbf{x})$ with $\mathbf{x}(0) = \mathbf{x}_0$.
By the local existence theorem, Theorem 8.1, extended to higher order systems,
this solution exists for $0 \le t < t^*$ with $t^* > 0$. This solution can then be continued
for $t \ge t^*$, and we denote by $t_1$ the largest value of $t$ for which the solution exists.
There are two possibilities, either $t_1 = \infty$ or $t_1 < \infty$. We now show that, for $|\mathbf{x}_0|$
sufficiently small, $t_1 = \infty$.

From the definition of the derivative with respect to a system,

$$\frac{d}{dt}V(\mathbf{x}(t)) = V^*(\mathbf{x}(t)) \quad \text{for } 0 \le t < t_1.$$

We can integrate this equation to give

$$V(\mathbf{x}(t)) - V(\mathbf{x}_0) = \int_0^t V^*(\mathbf{x}(s))\,ds \le 0,$$

since $V^* \le 0$. This means that $0 < V(\mathbf{x}(t)) \le V(\mathbf{x}_0)$ for $0 \le t < t_1$.
Now let $\epsilon$ satisfy $0 < \epsilon \le r$, and let $S$ be the closed, spherical shell with inner and
outer radii $\epsilon$ and $r$†. By continuity of $V$, and since $S$ is closed, $\mu = \min_{\mathbf{x} \in S} V(\mathbf{x})$
exists and is strictly positive. Since $V(\mathbf{x}) \to 0$ as $|\mathbf{x}| \to 0$, we can choose $\delta$ with
$0 < \delta < \epsilon$ such that for $|\mathbf{x}_0| \le \delta$, $V(\mathbf{x}_0) < \mu$, so that $0 < V(\mathbf{x}(t)) \le V(\mathbf{x}_0) < \mu$
for $0 \le t < t_1$. Since $\mu$ is the minimum value of $V$ in $S$, this gives $|\mathbf{x}(t)| < \epsilon$ for
$0 \le t < t_1$. If there exists $t_2$ such that $|\mathbf{x}(t_2)| = \epsilon$, then, when $t = t_2$, we also
have, from the definition of $\mu$, $\mu \le V(\mathbf{x}(t_2)) \le V(\mathbf{x}_0) < \mu$, which cannot hold. We
conclude that $t_1 = \infty$, and that, for a given $\epsilon > 0$, there exists a $\delta > 0$ such that
when $|\mathbf{x}_0| < \delta$, $|\mathbf{x}(t)| < \epsilon$ for $t \ge 0$, and hence that the origin is Lyapunov stable.

□

The proofs of the following two theorems are rather similar, and we will not give 
them here. 

Theorem 13.3 If, in some region $\Omega \subset \mathbb{R}^n$ that contains the origin, there exists
a scalar function $V(\mathbf{x})$ that is positive definite and for which $V^*(\mathbf{x})$ is negative
definite, then the origin is asymptotically stable.

Theorem 13.4 If, in some region $\Omega \subset \mathbb{R}^n$ that contains the origin, there exists a
scalar function $V(\mathbf{x})$ such that $V(\mathbf{0}) = 0$ and $V^*(\mathbf{x})$ is either positive definite or
negative definite, and if, in every neighbourhood $N$ of the origin with $N \subset \Omega$, there
exists at least one point $\mathbf{a}$ such that $V(\mathbf{a})$ has the same sign as $V^*(\mathbf{a})$, then the
origin is unstable.

Theorems 13.2 to 13.4 are known as Lyapunov’s theorems, and have a geometrical
interpretation that is particularly attractive for two-dimensional systems. The
equation $V(\mathbf{x}) = c$ then represents a surface in the $(x, y, V)$-space. By varying $c$
(through positive values only, since $V$ is positive definite), we can obtain a series of
contour lines, with $V = 0$ at the origin a local minimum on the surface, as shown
in Figure 13.8. Since $\dot{\mathbf{x}} = \mathbf{f}(\mathbf{x})$, the vector field $\mathbf{f}$ represents the direction taken by
an integral path at any point. The vector normal to the surface $V(\mathbf{x}) = c$ is $\nabla V$,
so that, if $V^* = \nabla V \cdot \mathbf{f} \le 0$, integral paths cannot point into the exterior of the
region $V(\mathbf{x}) < c$. We conclude that an integral path that starts sufficiently close
to the origin, for example with $V(\mathbf{x}_0) < c_1$, cannot leave the region bounded by
$V(\mathbf{x}) = c_1$, and hence that the origin is Lyapunov stable. Similarly, if $V^* < 0$, the
integral paths must actually cross from the exterior to the interior of the region.
Hence $V$ will decrease monotonically to zero from its initial value when the integral
path enters the region $\Omega$, and we conclude that the origin is asymptotically stable.

† The set of points with $\epsilon \le |\mathbf{x}| \le r$ in $\mathbb{R}^n$.




Fig. 13.8. (a) The local behaviour and (b) a contour plot of a Lyapunov function near the 
origin. 


Although we can now see why a Lyapunov function is useful, it can take consid- 
erable ingenuity to actually construct one for a given system. 

Example 1 

Consider the system

$$\dot{x} = -x - 2y^2, \qquad \dot{y} = xy - y^3.$$

The origin is the only equilibrium point, and the linearized system is $\dot{x} = -x$, $\dot{y} = 0$.
The eigenvalues are therefore 0 and $-1$, so this is a nonhyperbolic equilibrium point.
Let’s try to construct a Lyapunov function. We start by trying $V = x^2 + \alpha y^2$. This
is clearly positive definite for $\alpha > 0$, and $V(0, 0) = 0$. In addition,

$$V^* = \frac{dV}{dt} = 2x(-x - 2y^2) + 2\alpha y(xy - y^3) = -2x^2 + 2(\alpha - 2)xy^2 - 2\alpha y^4.$$

If we choose $\alpha = 2$, then $dV/dt = -2x^2 - 4y^4 < 0$ for all $x$ and $y$ excluding
the origin. From Theorems 13.2 and 13.3 we conclude that the origin is both
Lyapunov and asymptotically stable. As a general guideline, it is worth looking for
a homogeneous, algebraic Lyapunov function when $\mathbf{f}$ has a simple algebraic form.

Example 2

Consider the second order differential equation $\ddot{\theta} + f(\theta) = 0$ for $-\pi \le \theta \le \pi$, with
$\theta f(\theta)$ positive for $\theta \ne 0$, $f$ differentiable and $f(0) = 0$. We can write this as a second
order system,

$$\dot{x}_1 = x_2, \qquad \dot{x}_2 = -f(x_1),$$

where $x_1 = \theta$. The origin is an equilibrium point, but is it stable? In order to
construct a Lyapunov function, it is helpful to think in terms of an equivalent
physical system. By analogy with the model for a simple pendulum, which we
discussed in Section 9.1, we can think of $f(\theta)$ as the restoring force and $\dot{\theta}$ as the
angular velocity. The total energy of the system is the sum of the kinetic and
potential energies, which we can write as $E = \frac{1}{2}\dot{\theta}^2 + \int_0^\theta f(s)\,ds$. If this energy
were to decrease, we would expect the motionless, vertical state of the pendulum
to be asymptotically stable, whilst if it merely did not grow, we would expect it to
be Lyapunov stable. Guided by this physical insight, we define

$$V = \tfrac{1}{2}x_2^2 + \int_0^{x_1} f(s)\,ds.$$

Clearly $V(0, 0) = 0$, and, since $\int_0^{x_1} f(s)\,ds$ is positive for $x_1 \ne 0$ by the assumption
that $\theta f(\theta) > 0$, $V$ is positive definite for $-\pi < x_1 < \pi$. Finally,

$$V^* = \frac{dV}{dt} = f(x_1)\,x_2 + x_2 \cdot \left(-f(x_1)\right) = 0.$$

By Theorem 13.2, $V$ is a Lyapunov function, and the origin is Lyapunov stable.
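A numerical experiment makes the conclusion concrete. The sketch below takes the simple pendulum, $f(\theta) = \sin\theta$, as an illustrative choice of restoring force (an assumption, since the example is stated for general $f$), for which $V = \frac{1}{2}x_2^2 + 1 - \cos x_1$, and integrates the system with a standard fourth-order Runge-Kutta scheme. The initial state and step size are arbitrary.

```python
import math


def f(theta):
    """Illustrative restoring force: the simple pendulum, f(theta) = sin(theta)."""
    return math.sin(theta)


def V(x1, x2):
    """V = x2^2/2 + integral of sin(s) from 0 to x1 = x2^2/2 + 1 - cos(x1)."""
    return 0.5 * x2 ** 2 + 1.0 - math.cos(x1)


def rhs(x1, x2):
    """Right-hand side of the system x1' = x2, x2' = -f(x1)."""
    return x2, -f(x1)


def rk4(x1, x2, h, steps):
    """Classical fourth-order Runge-Kutta; returns the list of visited states."""
    out = [(x1, x2)]
    for _ in range(steps):
        k1 = rhs(x1, x2)
        k2 = rhs(x1 + 0.5 * h * k1[0], x2 + 0.5 * h * k1[1])
        k3 = rhs(x1 + 0.5 * h * k2[0], x2 + 0.5 * h * k2[1])
        k4 = rhs(x1 + h * k3[0], x2 + h * k3[1])
        x1 += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
        x2 += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
        out.append((x1, x2))
    return out


orbit = rk4(0.5, 0.0, 0.01, 5000)          # released from rest at theta = 0.5
values = [V(a, b) for a, b in orbit]
drift = max(values) - min(values)          # numerical variation of V along the orbit
print(drift)
```

Up to the small error of the integrator, $V$ is constant along the orbit, so the path stays on a closed contour of $V$ and never moves further from the origin than its starting amplitude. This is Lyapunov stability without asymptotic stability.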

Example 3 

Consider the differential equation $\ddot{x} + \dot{x} + x + x^2 = 0$. We can write this as a second
order system,

$$\dot{x}_1 = x_2, \qquad \dot{x}_2 = -x_1 - x_1^2 - x_2, \tag{13.21}$$

where $x_1 = x$. This has two equilibrium points, at $(-1, 0)$ and $(0, 0)$. It is
straightforward to determine the eigenvalues of these equilibrium points and show that both
are hyperbolic, with $(-1, 0)$ a saddle point and $(0, 0)$ a stable, clockwise spiral. We
can now construct a Lyapunov function that will give us some idea of the domain
of attraction of the stable equilibrium point at the origin. Consider the function

$$V = \tfrac{1}{2}(x_1^2 + x_2^2) + \tfrac{1}{3}x_1^3.$$

This function vanishes at the origin and is positive in the region

$$\Omega = \left\{ (x_1, x_2) \;\middle|\; x_2^2 > -x_1^2 - \tfrac{2}{3}x_1^3 \right\},$$

which is sketched, along with the phase portrait, in Figure 13.9.

Fig. 13.9. The region $\Omega$ and the phase portrait of the system (13.21). The region $\Omega$ lies
to the right of the curved, broken line, which is $V = 0$.

Let’s consider the curve $V = \frac{1}{6}$, which passes through the saddle point and the
point $(\frac{1}{2}, 0)$, as shown in Figure 13.10. If $V = V_0 < \frac{1}{6}$, we have a curve that encloses
the origin, but not the saddle point. By taking $V_0 < \frac{1}{6}$ arbitrarily close to
$\frac{1}{6}$, we can make the curve $V = V_0$ arbitrarily close to the saddle point. As we are
interested in the domain of attraction of the equilibrium point at the origin, we will
focus on $\Omega_0$, the subset of $\Omega$ given by $V < V_0 < \frac{1}{6}$, with $V_0$ close to $\frac{1}{6}$.
Since

$$V^* = \frac{dV}{dt} = x_2(x_1 + x_1^2) + (-x_1 - x_1^2 - x_2)x_2 = -x_2^2 \le 0,$$

we immediately have from Theorem 13.2 that the origin is Lyapunov stable. To
prove asymptotic stability requires more work, as $V^* = 0$ on $x_2 = 0$, which could
allow trajectories to escape from the region $\Omega_0$ through the two points where $V = V_0$
meets the $x_1$-axis, which are labelled as $A$ and $B$ in Figure 13.10. There are various
ways of dealing with this. The obvious one is to choose a different Lyapunov
function, which is possible, but technically difficult. We will use a phase plane
analysis. Consider $S_A$, the integral path through the point $A$. All other integral
paths through the boundary of $\Omega_0$ in the neighbourhood of $A$ enter $\Omega_0$. The integral
path $S_A$ cannot, therefore, lie along the boundary of $\Omega_0$. If $S_A$ does not enter $\Omega_0$,
it must intersect the integral path that enters $\Omega_0$ at $x_2 = 0^+$, which is not possible.
We conclude that the integral path through $A$ enters $\Omega_0$, as shown in Figure 13.11.
A similar argument holds at $B$.

Fig. 13.10. The regions $\Omega$ and $\Omega_0$ and the curves $V = 0$, $V = \frac{1}{6}$ and $V = V_0 < \frac{1}{6}$.

Since the Lyapunov function, $V$, is monotone decreasing away from the $x_1$-axis,
there cannot be any limit cycle solutions in $\Omega_0$. Finally, since $\Omega_0$ has all integral
paths entering it, and contains a single, stable equilibrium point and no limit cycles,
we conclude from the Poincaré-Bendixson theorem, Theorem 9.4, that all integral
paths in $\Omega_0$ enter the origin, which is therefore asymptotically stable, and that
$\Omega_0$ lies within the domain of attraction of the origin. In fact, we can see from
Figure 13.9 that this domain of attraction is considerably larger than $\Omega_0$, and is
bounded by the stable separatrices of the saddle point at $(-1, 0)$.
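The key numbers in this example can be confirmed directly. The sketch below assumes the Lyapunov function $V = \frac{1}{2}(x_1^2 + x_2^2) + \frac{1}{3}x_1^3$ and the system (13.21); it checks exactly, using rational arithmetic, that the level $V = \frac{1}{6}$ passes through $(-1, 0)$ and $(\frac{1}{2}, 0)$, and then follows one integral path that starts inside that level curve. The starting point $(0.4, 0)$, the step size and the integrator are illustrative choices.

```python
from fractions import Fraction
import math


def V(x1, x2):
    """Lyapunov function of Example 3; works for floats or exact Fractions."""
    return (x1 * x1 + x2 * x2) / 2 + x1 ** 3 / 3


def rhs(x1, x2):
    """Right-hand side of (13.21)."""
    return x2, -x1 - x1 * x1 - x2


def rk4_path(x1, x2, h=0.01, steps=4000):
    """Integrate (13.21) with classical RK4 and return every visited state."""
    out = [(x1, x2)]
    for _ in range(steps):
        k1 = rhs(x1, x2)
        k2 = rhs(x1 + 0.5 * h * k1[0], x2 + 0.5 * h * k1[1])
        k3 = rhs(x1 + 0.5 * h * k2[0], x2 + 0.5 * h * k2[1])
        k4 = rhs(x1 + h * k3[0], x2 + h * k3[1])
        x1 += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
        x2 += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
        out.append((x1, x2))
    return out


# Exact values of V at the saddle point and at (1/2, 0).
print(V(Fraction(-1), 0), V(Fraction(1, 2), 0))

path = rk4_path(0.4, 0.0)                   # starts with V < 1/6
levels = [V(a, b) for a, b in path]
final_distance = math.hypot(*path[-1])      # distance from the origin at the end
print(levels[0], final_distance)
```

Along the computed path $V$ never increases, and the path ends at the origin to within rounding, in line with the argument above.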


Fig. 13.11. The phase portrait in the neighbourhood of the point $A$ and the saddle point
at $(-1, 0)$.


Example 4

Consider the system

$$\dot{x} = x^2 - y^2, \qquad \dot{y} = -2xy.$$

This is a genuinely nonlinear system with an equilibrium point at the origin. The
linear approximation at the origin is $\dot{x} = \dot{y} = 0$, so both eigenvalues are zero. Let’s
try a trial function of the form $V = \alpha xy^2 - x^3$, which has $V(0, 0) = 0$. Since

$$V^* = \frac{dV}{dt} = (\alpha y^2 - 3x^2)(x^2 - y^2) - 4\alpha x^2y^2 = 3(1 - \alpha)x^2y^2 - \alpha y^4 - 3x^4,$$

we can choose $\alpha = 1$, so that $V^* = -y^4 - 3x^4$, which is negative definite. We can
see that $V = x(y^2 - x^2) = 0$ when $x = 0$ or $y = \pm x$, so that $V$ changes sign six
times on any circle that surrounds the origin. In particular, in every neighbourhood
of the origin there is at least one point where $V$ has the same sign as $V^*$, so that
all of the conditions of Theorem 13.4 are satisfied by $V$, and hence the origin is
unstable.
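The two facts used in Example 4 to apply Theorem 13.4, namely that $V^*$ is negative definite and that $V$ takes both signs arbitrarily close to the origin, can be checked mechanically. This is a sketch with $\alpha = 1$; the function names, the circle radius and the number of sample points are illustrative assumptions.

```python
import math


def V(x, y):
    """Trial function of Example 4 with alpha = 1: V = x*y^2 - x^3."""
    return x * y * y - x ** 3


def V_star(x, y):
    """grad V . f for the vector field f = (x^2 - y^2, -2xy)."""
    Vx = y * y - 3.0 * x * x
    Vy = 2.0 * x * y
    return Vx * (x * x - y * y) + Vy * (-2.0 * x * y)


def sign_changes_on_circle(radius, samples=3600):
    """Count sign changes of V around a circle.

    The sample angles are offset by half a step so that none of them falls on
    a zero of V (the lines x = 0 and y = +/- x).
    """
    signs = []
    for k in range(samples):
        t = 2.0 * math.pi * (k + 0.5) / samples
        signs.append(V(radius * math.cos(t), radius * math.sin(t)) > 0)
    return sum(signs[k] != signs[k - 1] for k in range(samples))


print(sign_changes_on_circle(0.1))
print(V(0.01, 0.0), V_star(0.01, 0.0))  # both negative on the positive x-axis
```

On the positive $x$-axis both $V$ and $V^*$ are negative however small $x$ is, which supplies the point $\mathbf{a}$ required by Theorem 13.4.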


Further refinements exist of the Lyapunov Theorems 13.2 to 13.4 that we have 
studied in this section, and the interested reader is referred to Coddington and 
Levinson (1955) for further information. 


13.3 Bifurcation Theory 

13.3.1 First Order Ordinary Differential Equations 

Let’s consider the first order ordinary differential equation, (9.5), whose
hyperbolic equilibrium points we studied in Section 9.2. For a hyperbolic equilibrium
point at $x = x_1$, we saw that a simple linearization about $x = x_1$ determines the
local behaviour and stability. If $X'(x_1) = 0$, $x = x_1$ is a nonhyperbolic equilibrium
point, and we need to retain more terms in the Taylor expansion of $X$, (9.6), in
order to sort out what happens close to the equilibrium point. For example, if
$X(x_1) = X'(x_1) = 0$ and $X''(x_1) \ne 0$, then, with $x = x_1 + \bar{x}$,

$$\frac{d\bar{x}}{dt} \approx \tfrac{1}{2}X''(x_1)\bar{x}^2 \quad \text{for } |\bar{x}| \ll 1,$$

and hence

$$\bar{x} \approx -\frac{2}{X''(x_1)(t - t_0)},$$

for some constant $t_0$. The graph of $X(x_1 + \bar{x})$ close to $\bar{x} = 0$ is shown in Figure 13.12
for $X''(x_1) < 0$. Focusing on this case, we can see that $\bar{x} \to 0$ ($x \to x_1$) as $t \to \infty$
for $t_0 < 0$, whilst $\bar{x} \to -\infty$ as $t \to t_0$ for $t_0 > 0$. This nonhyperbolic equilibrium
point therefore attracts solutions from $x > x_1$ and repels them from $x < x_1$. Note
that the rate at which solutions asymptote to $x = x_1$ from $x > x_1$ is algebraic,
in contrast to the faster, exponential approach associated with stable hyperbolic
equilibrium points.
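The algebraic decay law follows from separating variables in the leading-order equation. As a short worked step, writing $\bar{x}$ for the displacement of $x$ from $x_1$ (the bar is notation assumed here) and $t_0$ for the constant of integration:

```latex
\frac{d\bar{x}}{dt} = \tfrac{1}{2}X''(x_1)\,\bar{x}^2
\;\Longrightarrow\;
\int \frac{d\bar{x}}{\bar{x}^2} = \tfrac{1}{2}X''(x_1)\int dt
\;\Longrightarrow\;
-\frac{1}{\bar{x}} = \tfrac{1}{2}X''(x_1)\,(t - t_0)
\;\Longrightarrow\;
\bar{x} = -\frac{2}{X''(x_1)\,(t - t_0)} .
```

When $X''(x_1) < 0$ this is $\bar{x} = 2/\left(|X''(x_1)|\,(t - t_0)\right)$, which is positive and decays like $1/t$ for $t > t_0$, and is negative and unbounded as $t \to t_0$ from below.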



Fig. 13.12. The graph of $d\bar{x}/dt = X(x_1 + \bar{x}) \approx \frac{1}{2}X''(x_1)\bar{x}^2$ for $X''(x_1) < 0$.


A system that contains one or more nonhyperbolic equilibrium points is said 
to be structurally unstable. This means that a small perturbation, not to the 
solution but to the model itself, for example the addition of a small extra term to 
X(x), can lead to a qualitative difference in the structure of the set of solutions, for 
example, a change in the number of equilibrium points or in their stability. Consider 
the function shown in Figure 13.12. The addition of a small positive constant to 
X(x) would shift the graph upwards by a small amount, and give two equilibrium 
solutions, whilst the addition of a small negative constant would shift the graph 
downwards and lead to the absence of any equilibrium solutions. Let’s investigate 
this further. 

Consider the equation

$$\dot{x} = \mu - x^2, \tag{13.22}$$

where $\mu$ is a parameter. For $\mu > 0$ there are two hyperbolic equilibrium points
at $x = \pm\sqrt{\mu}$, whilst for $\mu < 0$ there are no equilibrium points, as shown in
Figure 13.13. When $\mu = 0$, $X(x) = -x^2$ and $x = 0$ is a nonhyperbolic equilibrium
point of the type that we analyzed above. We can now draw a bifurcation
diagram, which shows the position of the equilibrium solutions as a function of $\mu$,
with stable equilibria as solid lines and unstable equilibria as dashed lines, as shown
in Figure 13.14. The point $\mu = 0$, $x = 0$ is called a bifurcation point, because
the qualitative nature of the phase line changes there. The bifurcation associated
with $\dot{x} = \mu - x^2$ is called a saddle-node bifurcation, for reasons that will become
clear in Section 13.3.2. Any first order system that undergoes a saddle-node
bifurcation, in other words one that contains a bifurcation point where two equilibrium
solutions meet and then disappear, can be written in the form $\dot{x} = \mu - x^2$ in the
neighbourhood of the bifurcation point. The equation $\dot{x} = \mu - x^2$ is called the normal
form for the saddle-node bifurcation.
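The content of the bifurcation diagram can be generated by a few lines of code. This sketch assumes only the normal form $\dot{x} = \mu - x^2$; the function name and the string labels are illustrative. Stability is read off from the sign of the derivative of the right-hand side, $-2x$, at each equilibrium point.

```python
import math


def equilibria(mu):
    """Equilibrium points of dx/dt = mu - x^2, each tagged with its stability."""
    if mu < 0:
        return []                          # no real equilibrium points
    if mu == 0:
        return [(0.0, "nonhyperbolic")]    # the bifurcation point itself
    root = math.sqrt(mu)
    # d/dx (mu - x^2) = -2x: negative at +sqrt(mu), positive at -sqrt(mu)
    return [(root, "stable"), (-root, "unstable")]


for mu in (-0.5, 0.0, 0.25):
    print(mu, equilibria(mu))
```

As $\mu$ decreases through zero the stable and unstable branches meet at $x = 0$ and disappear, which is the defining feature of the saddle-node bifurcation.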



Fig. 13.13. Graphs of $\dot{x} = \mu - x^2$ for (a) $\mu > 0$, (b) $\mu < 0$.

Example 

Consider the ordinary differential equation

$$\dot{y} = \lambda - 2\lambda y - y^2. \tag{13.23}$$

This has equilibrium points at $y = -\lambda \pm \sqrt{\lambda^2 + \lambda}$, so there are no real equilibrium
points for $-1 < \lambda < 0$ and two equilibrium points for $\lambda < -1$ or $\lambda > 0$. This suggests that
there are saddle-node bifurcations at $\lambda = 0$, $y = 0$ and $\lambda = -1$, $y = 1$. Now note
that, at the equilibrium points,

$$\frac{d\dot{y}}{dy} = -2\lambda - 2y = \mp 2\sqrt{\lambda^2 + \lambda}.$$

Using the analysis of the previous section, we can see that the equilibrium point
with the larger value of $y$ is stable, whilst the other is unstable. The bifurcation
diagram is shown in Figure 13.15, and certainly looks as if it contains two
saddle-node bifurcations.

Fig. 13.14. The saddle-node bifurcation diagram.

Fig. 13.15. The bifurcation diagram for $\dot{y} = \lambda - 2\lambda y - y^2$.

For $y \ll 1$ and $\lambda \ll 1$, $\dot{y} = \lambda - 2\lambda y - y^2 \approx \lambda - y^2$, which is precisely the normal
form for the saddle-node bifurcation. All of the terms on the right hand side of
(13.23) are small, but the only one that is sure to be smaller than at least one of
the others is the second, since $y \ll 1$ means that $\lambda y \ll \lambda$. We make no assumption
about how big $\lambda$ is compared with $y^2$. To examine the neighbourhood of the other
bifurcation point, we shift the origin there using $\lambda = -1 + A\mu$, $y = 1 + Bx$, where
$A$ and $B$ are constant scaling factors that we will fix later. In terms of $\mu$ and $x$,
(13.23) becomes

$$\dot{x} = -\frac{A}{B}\mu - 2A\mu x - Bx^2 \approx -\frac{A}{B}\mu - Bx^2 \quad \text{for } \mu \ll 1 \text{ and } x \ll 1.$$

By choosing $A = -1$ and $B = 1$ this becomes the normal form for the saddle-node
bifurcation. Note that the required change of coordinate, $\lambda = -1 - \mu$, indicates
that the sense of the bifurcation is reversed with respect to $\lambda$, as can be seen in
Figure 13.15.
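The two saddle-node bifurcations of (13.23) can be located by tabulating the equilibrium points against $\lambda$. The sketch below assumes the equilibrium formula $y = -\lambda \pm \sqrt{\lambda^2 + \lambda}$ and classifies each point by the sign of $-2\lambda - 2y$; the function name and labels are illustrative.

```python
import math


def equilibria(lam):
    """Real equilibrium points of dy/dt = lam - 2*lam*y - y^2, with stability."""
    disc = lam * lam + lam
    if disc < 0:
        return []                                  # happens for -1 < lam < 0
    if disc == 0:
        return [(0.0 - lam, "nonhyperbolic")]      # lam = 0 or lam = -1
    s = math.sqrt(disc)
    result = []
    for y in (-lam + s, -lam - s):
        slope = -2.0 * lam - 2.0 * y               # d/dy of the right-hand side
        result.append((y, "stable" if slope < 0 else "unstable"))
    return result


for lam in (-2.0, -1.0, -0.5, 0.0, 0.5):
    print(lam, equilibria(lam))
```

The output has two equilibrium points on either side of the gap $-1 < \lambda < 0$ and a single nonhyperbolic point at each end of it, with the larger root stable in both cases.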

We can formalize the notion of a saddle-node bifurcation using the following 
theorem. 


Theorem 13.5 (Saddle-node bifurcation) Consider the first order differential
equation

$$\dot{x} = f(x, \mu),$$

with $f(0, 0) = f_x(0, 0) = 0$. Provided that $f_\mu(0, 0) \ne 0$ and $f_{xx}(0, 0) \ne 0$, there
exists a continuous curve of equilibrium points in the neighbourhood of $(0, 0)$, which
is tangent to the line $\mu = 0$ there. In addition,

(i) if $f_\mu(0, 0)f_{xx}(0, 0) < 0$, then there are no equilibrium points in the
neighbourhood of $(0, 0)$ for $\mu < 0$, whilst for $\mu > 0$, in a sufficiently small
neighbourhood of $(0, 0)$ there are two hyperbolic equilibrium points.

(ii) if $f_\mu(0, 0)f_{xx}(0, 0) > 0$, then there are no equilibrium points in the
neighbourhood of $(0, 0)$ for $\mu > 0$, whilst for $\mu < 0$, in a sufficiently small
neighbourhood of $(0, 0)$ there are two hyperbolic equilibrium points.

If $f_{xx}(0, 0) < 0$, the equilibrium point with the larger value of $x$ is stable, whilst the
other is unstable, and vice versa for $f_{xx}(0, 0) > 0$.


Proof We will give an informal proof. Since $f(0, 0) = f_x(0, 0) = 0$, the Taylor
expansion of $f(x, \mu)$ about $(0, 0)$ shows that

$$\dot{x} \sim A_0(\mu) + A_1(\mu)x + A_2(\mu)x^2 \quad \text{for } |x| \ll 1 \text{ and } |\mu| \ll 1,$$

where

$$A_0(\mu) = f_\mu(0, 0)\mu + \tfrac{1}{2}f_{\mu\mu}(0, 0)\mu^2, \quad A_1(\mu) = f_{x\mu}(0, 0)\mu, \quad A_2(\mu) = \tfrac{1}{2}f_{xx}(0, 0).$$

There are therefore equilibrium points at

$$x = \frac{-A_1 \pm \sqrt{A_1^2 - 4A_0A_2}}{2A_2} \sim \pm\sqrt{\frac{-2f_\mu(0, 0)\mu}{f_{xx}(0, 0)}} \quad \text{for } |\mu| \ll 1.$$

This shows that the location of the equilibrium points is as described in the theorem.
Finally, at the equilibrium points,

$$\frac{d\dot{x}}{dx} \sim A_1(\mu) + 2A_2(\mu)x \sim f_{xx}(0, 0)x,$$

since $x = O(|\mu|^{1/2})$. The stability of the equilibrium points is therefore determined
by the sign of $f_{xx}(0, 0)$. □

Example: The CSTR 

Many industrially important chemicals are produced in bulk using a continuous 
flow, stirred tank reactor (CSTR). This is simply a large container, to which fresh 
chemicals are continuously supplied and from which the resulting reactants are 
continuously withdrawn, as sketched in Figure 13.16. A stirrer ensures that the 
chemicals in the CSTR are well mixed. Let’s consider what happens when chemicals 
A and B that react through the cubic autocatalytic reaction step (13.7) are fed into 
a CSTR, and assume that the idea is to convert as much of the reactant A into the 
autocatalyst B as possible. 



Fig. 13.16. A continuous flow, stirred tank reactor (CSTR). 

Since the CSTR is well stirred, we assume that the concentrations of the chemicals
are spatially uniform, and given by $a(t)$ and $b(t)$. The rate of change of the
total amount of species $A$ in the CSTR is equal to the rate at which it is produced
by chemical reaction plus the rate at which it flows in minus the rate at which it
flows out. If the CSTR has constant volume $V$, constant inlet and (by conservation
of mass) outlet flowrate $q$ and inlet concentration of $A$ given by $a_0$, we have

$$\frac{d}{dt}(Va) = -Vkab^2 + q(a_0 - a),$$

and hence

$$\frac{da}{dt} = -kab^2 + \frac{a_0 - a}{t_{\rm res}}. \tag{13.24}$$

The residence time, $t_{\rm res} = V/q$, is the time it takes for a volume $V$ of fresh
reactants to flow into the CSTR, and characterizes the period for which a fluid
element typically remains within the CSTR. Similarly,

$$\frac{db}{dt} = kab^2 + \frac{b_0 - b}{t_{\rm res}}, \tag{13.25}$$

where $b_0$ is the inlet concentration of the autocatalyst, $B$.

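We now define dimensionless variables

$$\alpha = \frac{a}{a_0}, \quad \beta = \frac{b}{a_0}, \quad \tau = k a_0^2 t,$$

in terms of which (13.24) and (13.25) become

$$\frac{d\alpha}{d\tau} = -\alpha\beta^2 + \frac{1 - \alpha}{\tau_{\rm res}}, \tag{13.26}$$

$$\frac{d\beta}{d\tau} = \alpha\beta^2 + \frac{\beta_0 - \beta}{\tau_{\rm res}}, \tag{13.27}$$

where

$$\tau_{\rm res} = k a_0^2 t_{\rm res}, \qquad \beta_0 = \frac{b_0}{a_0}.$$

We can now add (13.26) and (13.27) to obtain

$$\frac{d}{d\tau}(\alpha + \beta) = \frac{1 + \beta_0 - (\alpha + \beta)}{\tau_{\rm res}},$$

a linear equation that we can solve for $\alpha + \beta$ using an integrating factor. The
solution is

$$\alpha + \beta = 1 + \beta_0 + ke^{-\tau/\tau_{\rm res}},$$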


where $k$ is a constant. Clearly, $\alpha + \beta \to 1 + \beta_0$ as $\tau \to \infty$. In particular, for
$\tau \gg \tau_{\rm res}$, and hence for $t \gg t_{\rm res}$, $\alpha + \beta \sim 1 + \beta_0$. The term $ke^{-\tau/\tau_{\rm res}}$ represents
an initial transient, which decays to zero exponentially fast over a time scale given
by the residence time. We therefore assume that $\alpha + \beta = 1 + \beta_0$, and can thereby
eliminate $\beta$ from (13.26). In the analysis that follows, it is more convenient to work
in terms of $z = 1 - \alpha$, the extent to which the reactant $A$ has been converted to $B$.
This gives us a nonlinear, first order ordinary differential equation for $z$,

$$\frac{dz}{d\tau} = R(z) - F(z), \tag{13.28}$$

where

$$R(z) = (1 - z)(z + \beta_0)^2, \qquad F(z) = \frac{z}{\tau_{\rm res}}.$$

We can now see that a steady state, where $dz/d\tau = 0$, occurs when the rate of
reaction, $R(z)$, is balanced by the rate at which $A$ flows into the CSTR, $F(z)$, and
hence when $F(z) = R(z)$. The function $R(z)$ is a cubic polynomial, whilst $F(z)$
is a straight line through the origin. The steady states are therefore given by the
points of intersection of these two curves.


Case 1: No autocatalyst in the inflow, $\beta_0 = 0$

When $\beta_0 = 0$, $R(z)$ has a repeated root at $z = 0$, as shown in Figure 13.17.
The straight line $F(z)$ always passes through $z = 0$, which is therefore always a
steady state. The state $z = 0$ represents a CSTR that contains no autocatalyst,
just the reactant A supplied by the inlet flow. A simple calculation of the steady
state solutions shows that for $\tau_{\rm res} < 4$, $z = 0$ is the only steady state, whilst for
$\tau_{\rm res} > 4$, there are two further points of intersection between $F(z)$ and $R(z)$, and
hence two other steady states, at $z = z_1$ and $z = z_2$, where

$$z_{1,2} = \frac{1}{2}\left(1 \mp \sqrt{1 - \frac{4}{\tau_{\rm res}}}\right).$$

Note that $0 < z_1 < \frac{1}{2}$ and $\frac{1}{2} < z_2 < 1$, and $z_1 \to 0$, $z_2 \to 1$ as $\tau_{\rm res} \to \infty$. The
steady states are sketched in Figure 13.18. The stability of the steady states is
easily calculated, and we find that $z = 0$ and $z = z_2$ are stable states, whilst $z = z_1$
is unstable. There is a saddle-node bifurcation at $\tau_{\rm res} = 4$. This situation, with
$\beta_0 = 0$, is a realistic one, since it is probably desirable to run the system with just
a single species entering the CSTR. We clearly want to run the CSTR in the state
$z = z_2$, where more than half of the reactant A is converted to B. We also
want to make the residence time as small as possible, to increase the rate at which
B is produced. However, we can now see that, if $\tau_{\rm res}$ is slowly decreased, $z_2$ will also
slowly decrease, and that when $\tau_{\rm res}$ decreases past 4, the saddle-node bifurcation
point, the situation changes dramatically. The only available steady state is $z = 0$,
so that no autocatalyst remains in the CSTR, and the reaction stops. This is
known as washout. Attempts to recover the desirable state $z = z_2$ by increasing
the residence time, $\tau_{\rm res}$, are doomed to failure when there is no autocatalyst entering
the CSTR.
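The steady states and their stability for Case 1 can be checked with a few lines of Python. This is only an illustrative sketch: the function names `steady_states` and `is_stable` are hypothetical, and the stability test simply uses the sign of $d(R - F)/dz$ at each steady state.

```python
import math

def steady_states(tau_res):
    """Steady states of dz/dtau = z^2 (1 - z) - z / tau_res (Case 1, beta_0 = 0)."""
    states = [0.0]
    disc = 1.0 - 4.0 / tau_res
    if disc > 0.0:
        root = math.sqrt(disc)
        states += [0.5 * (1.0 - root), 0.5 * (1.0 + root)]  # z_1 and z_2
    return states

def is_stable(z, tau_res):
    """A steady state is stable when d/dz [R(z) - F(z)] < 0 there."""
    return 2.0 * z - 3.0 * z * z - 1.0 / tau_res < 0.0

for tau in (3.0, 5.0):
    print(tau, [(round(z, 4), is_stable(z, tau)) for z in steady_states(tau)])
```

For a residence time below 4 only the washout state is returned, whilst above 4 the two extra states appear, with the middle one unstable.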



Fig. 13.17. The curves $R(z)$ and $F(z)$ when $\beta_0 = 0$.

Fig. 13.18. The bifurcation diagram when $\beta_0 = 0$.


Case 2: Autocatalyst in the inflow, $\beta_0 > 0$

Let’s now see how the situation is affected by including some autocatalyst in
the inflow. When $\beta_0 > 0$, the cubic polynomial $R(z)$ has a single positive root at
$z = 1$, and is strictly positive for $0 \leq z < 1$, as shown in Figure 13.19. In order to
determine when $F(z)$ is tangent to $R(z)$, we must simultaneously solve $R(z) = F(z)$
and $R'(z) = F'(z)$, which gives

$$(z + \beta_0)^2 (1 - z) = \frac{z}{\tau_{\rm res}}, \tag{13.29}$$

$$(z + \beta_0)(-3z + 2 - \beta_0) = \frac{1}{\tau_{\rm res}}. \tag{13.30}$$

Eliminating $\tau_{\rm res}$ between these two equations gives $2z^2 - z + \beta_0 = 0$, and hence the
points of tangency are at

$$z = z_\pm = \frac{1}{4}\left(1 \pm \sqrt{1 - 8\beta_0}\right).$$

We conclude that there are no such points of tangency for $\beta_0 > \frac{1}{8}$, and hence
that there is a unique stable solution, as shown in Figures 13.19 and 13.20. For
$0 < \beta_0 < \frac{1}{8}$, there is a unique solution for $0 \leq \tau_{\rm res} < \tau_+$ and $\tau_{\rm res} > \tau_-$, and three
solutions for $\tau_+ < \tau_{\rm res} < \tau_-$, where, from (13.30),

$$\tau_\pm = \frac{1}{(z_\pm + \beta_0)(2 - \beta_0 - 3z_\pm)}.$$

There are now two saddle-node bifurcations, at $\tau_{\rm res} = \tau_\pm$.
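As a quick numerical check of the tangency conditions, the sketch below evaluates $z_\pm$ and $\tau_\pm$ for a given $\beta_0$ and confirms that each pair also satisfies (13.29). The function names are hypothetical, and the value $\beta_0 = 0.05$ is chosen purely for illustration.

```python
import math

def tangency_points(beta0):
    """Return [(z_plus, tau_plus), (z_minus, tau_minus)], or [] if beta0 > 1/8."""
    disc = 1.0 - 8.0 * beta0
    if disc < 0.0:
        return []
    points = []
    for sign in (+1.0, -1.0):
        z = 0.25 * (1.0 + sign * math.sqrt(disc))
        tau = 1.0 / ((z + beta0) * (2.0 - beta0 - 3.0 * z))  # from (13.30)
        points.append((z, tau))
    return points

def residual_13_29(z, tau, beta0):
    """Left minus right hand side of (13.29); zero at a point of tangency."""
    return (z + beta0) ** 2 * (1.0 - z) - z / tau

beta0 = 0.05
for z, tau in tangency_points(beta0):
    print(f"z = {z:.5f}, tau_res = {tau:.5f}, residual = {residual_13_29(z, tau, beta0):.2e}")
```

With this choice of $\beta_0$ the two saddle-node points are well separated, with $\tau_+ < \tau_-$, so there is a finite window of residence times with three steady states.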
Fig. 13.19. The curves $R(z)$ and $F(z)$ when $\beta_0 > 0$.


We conclude that for $\beta_0 > \frac{1}{8}$, there is no possibility of washout, just a unique
steady state solution for any given residence time, $\tau_{\rm res}$. For $0 < \beta_0 < \frac{1}{8}$, we again
have the possibility, if not of a washout of the autocatalyst, at least of a dramatic
decrease in the concentration of B when $\tau_{\rm res}$ falls below $\tau_+$. However, if $\tau_{\rm res}$ is
now increased again, there will be a dramatic increase in the concentration of B as
$\tau_{\rm res}$ increases past $\tau_-$. This dependence of the steady state on whether a parameter
is being increased or decreased is known as hysteresis.
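The hysteresis loop can be illustrated numerically by sweeping the residence time slowly up and then down, letting $z$ relax to a steady state at each value. The sketch below is an assumption-laden illustration rather than a definitive calculation: the value $\beta_0 = 0.05$, the grid of residence times, the explicit Euler time stepping and the names `relax` and `sweep` are all choices made here.

```python
def relax(z, tau_res, beta0, dt=0.1, steps=5000):
    """March dz/dtau = R(z) - F(z) forward with explicit Euler steps."""
    for _ in range(steps):
        z += dt * ((1.0 - z) * (z + beta0) ** 2 - z / tau_res)
    return z

def sweep(taus, beta0, z_start):
    """Follow the steady state quasi-statically as tau_res moves through taus."""
    z, branch = z_start, []
    for tau in taus:
        z = relax(z, tau, beta0)
        branch.append(z)
    return branch

beta0 = 0.05
taus = [2.0 + 0.25 * i for i in range(21)]               # 2.0, 2.25, ..., 7.0
up = sweep(taus, beta0, z_start=0.0)                     # increasing tau_res
down = sweep(taus[::-1], beta0, z_start=up[-1])[::-1]    # decreasing tau_res
for tau, zu, zd in zip(taus, up, down):
    print(f"{tau:5.2f}  up: {zu:.4f}  down: {zd:.4f}")
```

For residence times between the two saddle-node points the upward and downward sweeps settle on different branches, which is the signature of hysteresis.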

Finally, note that when $\beta_0 = \frac{1}{8}$ the two saddle-node points merge and disappear
at $\tau_{\rm res} = \frac{64}{27}$. This is itself a bifurcation, and is known as a codimension two
bifurcation. Such a bifurcation can only occur in a system that has at least two
parameters (here $\beta_0$ and $\tau_{\rm res}$).

Fig. 13.20. The bifurcation diagram when $\beta_0 > 0$.

We will study two other types of bifurcation. Consider the normal form

$$\dot{x} = \mu x - x^2. \tag{13.31}$$

This system has equilibrium points at $x = 0$ and $x = \mu$. When $\mu = 0$ we again have
the nonhyperbolic equilibrium point given by $\dot{x} = -x^2$, whilst when $\mu \neq 0$ there
are always two equilibrium points. In this case,

$$\frac{d\dot{x}}{dx} = \mu - 2x = \begin{cases} -\mu & \text{at } x = \mu, \\ \mu & \text{at } x = 0, \end{cases}$$

and hence

$$x = \mu \text{ is } \begin{cases} \text{stable for } \mu > 0, \\ \text{unstable for } \mu < 0, \end{cases} \qquad x = 0 \text{ is } \begin{cases} \text{unstable for } \mu > 0, \\ \text{stable for } \mu < 0. \end{cases}$$

The bifurcation diagram is shown in Figure 13.21. This is called a transcritical
bifurcation. At the bifurcation point, the two equilibrium solutions pass through
each other and exchange stabilities, so this sort of bifurcation is often referred to
as an exchange of stabilities.

Theorem 13.6 (Transcritical bifurcation) Consider the first order differential
equation

$$\dot{x} = f(x, \mu),$$

with $f(0,0) = f_x(0,0) = f_\mu(0,0) = 0$. Provided that $f_{xx}(0,0) \neq 0$ and
$f_{\mu x}^2(0,0) - f_{xx}(0,0) f_{\mu\mu}(0,0) > 0$, there exist two continuous curves of equilibrium points in
the neighbourhood of $(0,0)$. These curves intersect transversely at $(0,0)$. For each
$\mu \neq 0$ there are two hyperbolic equilibrium points in the neighbourhood of $x = 0$. If
$f_{xx}(0,0) < 0$, the equilibrium point with the larger value of $x$ is stable, whilst the
other is unstable, and vice versa for $f_{xx}(0,0) > 0$.

We will leave the informal proof as Exercise 13.9.

Fig. 13.21. The transcritical bifurcation.


Finally, consider the normal form

$$\dot{x} = \mu x - x^3. \tag{13.32}$$

This has a single equilibrium point at $x = 0$ for $\mu < 0$, and three equilibrium
points at $x = 0$ and $x = \pm\sqrt{\mu}$ for $\mu > 0$. The bifurcation diagram is shown
in Figure 13.22(a). The bifurcation at $x = 0$, $\mu = 0$ is called a supercritical
pitchfork bifurcation. A similar bifurcation, in which the two new equilibrium
points created at the bifurcation point are unstable, is the subcritical pitchfork
bifurcation, with normal form $\dot{x} = \mu x + x^3$, whose bifurcation diagram is shown
in Figure 13.22(b).
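The equilibria of these first order normal forms, and their stability, can be tabulated directly from the sign of $d\dot{x}/dx$. The following sketch does this for the transcritical and the two pitchfork normal forms; the function names and the string labels are hypothetical choices made for this illustration.

```python
import math

def equilibria(kind, mu):
    """Equilibria of the first order normal forms quoted in the text."""
    if kind == "transcritical":            # x' = mu x - x^2
        return sorted({0.0, mu})
    if kind == "supercritical pitchfork":  # x' = mu x - x^3
        return [-math.sqrt(mu), 0.0, math.sqrt(mu)] if mu > 0 else [0.0]
    if kind == "subcritical pitchfork":    # x' = mu x + x^3
        return [-math.sqrt(-mu), 0.0, math.sqrt(-mu)] if mu < 0 else [0.0]
    raise ValueError(kind)

def slope(kind, mu, x):
    """d(x')/dx at x; negative means stable, positive means unstable."""
    if kind == "transcritical":
        return mu - 2.0 * x
    sign = -1.0 if kind == "supercritical pitchfork" else 1.0
    return mu + 3.0 * sign * x * x

def classify(kind, mu):
    """List (equilibrium, stability label) pairs for the chosen normal form."""
    labels = []
    for x in equilibria(kind, mu):
        s = slope(kind, mu, x)
        labels.append((x, "stable" if s < 0 else "unstable" if s > 0 else "nonhyperbolic"))
    return labels

for kind in ("transcritical", "supercritical pitchfork", "subcritical pitchfork"):
    for mu in (-1.0, 0.0, 1.0):
        print(kind, mu, classify(kind, mu))
```

At $\mu = 0$ each normal form has a single equilibrium point with zero slope, which is the nonhyperbolic point at which the bifurcation occurs.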

Theorem 13.7 (Pitchfork bifurcation) Consider the first order differential
equation

$$\dot{x} = f(x, \mu),$$

with $f(0,0) = f_x(0,0) = f_\mu(0,0) = f_{xx}(0,0) = 0$. Provided that $f_{\mu x}(0,0) \neq 0$
and $f_{xxx}(0,0) \neq 0$, there exist two continuous curves of equilibrium points in the
neighbourhood of $(0,0)$. One curve passes through $(0,0)$ transverse to the line $\mu = 0$,
whilst the other is tangent to $\mu = 0$ at $x = 0$. In addition,

(i) if $f_{\mu x}(0,0) f_{xxx}(0,0) < 0$, then, close to $(0,0)$, there is a single equilibrium
point for $\mu < 0$ and three equilibrium points for $\mu > 0$.

(ii) if $f_{\mu x}(0,0) f_{xxx}(0,0) > 0$, then, close to $(0,0)$, there is a single equilibrium
point for $\mu > 0$ and three equilibrium points for $\mu < 0$.

If $f_{xxx}(0,0) < 0$, the single equilibrium point and the outer two of the three equilibrium
points are stable, whilst the middle of the three equilibrium points is unstable,
and vice versa for $f_{xxx}(0,0) > 0$.

We leave the informal proof as Exercise 13.9.


Fig. 13.22. The (a) supercritical and (b) subcritical pitchfork bifurcation.


Although these bifurcations seem rather abstract, they often describe the quali- 
tative behaviour of complicated physical systems, as we saw for the CSTR. Another 
simple example is the buckling beam. Consider a straight beam of elastic material 
under compression from two equal and opposite forces along its axis, one applied 
at each end, as shown in Figure 13.23(a). Now consider the displacement, x, of the 
midpoint of the beam. For sufficiently small applied forces, the beam undergoes a 
simple axial compression and x = 0. As the applied forces are slowly increased, at a 
critical value the beam buckles, as shown in Figure 13.23(b), because the unbuckled 
state becomes unstable. If the beam has perfect left-right symmetry, it is equally 
likely to buckle left or right. When the displacement of the midpoint of the beam 
is plotted as a function of the magnitude of the applied forces, we obtain the bifurcation
diagram corresponding to a supercritical pitchfork, as in Figure 13.22(a). Of
course, in practice, a beam, for example a ruler held between your forefingers, will 
not have perfect symmetry, and will buckle in a preferred direction. This suggests 
that the pitchfork bifurcation is itself structurally unstable. You can investigate 
this by trying Exercise 13.7. 


Fig. 13.23. An elastic beam under compression (a) before and (b) after buckling.
In (a) the applied force is $F_0$; in (b) it is $F_1 > F_0$.


13.3.2 Second Order Ordinary Differential Equations 

As for first order equations, second order systems that possess an equilibrium 
point with an eigenvalue with zero real part are structurally unstable. A small 
change in the governing equations can change the qualitative nature of the phase 
portrait. For example, the simple, frictionless pendulum, with phase portrait shown 
in Figure 9.2(b), contains a nonlinear centre. Since a centre has two eigenvalues 
with zero real part, we conclude that this system is structurally unstable. In terms 
of the physics, consider what happens if we allow for a tiny amount of friction. 
This will gradually reduce the energy and the amplitude of the motion, with the 
pendulum being brought to rest as $t \to \infty$. This is reflected in what happens to the
phase portrait when a term $\mu\, d\theta/dt$, with $\mu > 0$, is added to the left hand side of
(9.2) to model the effect of friction. No matter how small $\mu$ is, the nonlinear centre
is transformed into a stable focus, and all integral paths asymptote to $\theta = \dot{\theta} = 0$ as
$t \to \infty$ (see Exercise 9.6).
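The loss of the centre under an arbitrarily small amount of friction is visible in the eigenvalues of the linearized problem. Equation (9.2) is not reproduced in this section, so the sketch below assumes, purely for illustration, that the damped pendulum takes the form $\ddot{\theta} + \mu\dot{\theta} + \omega^2 \sin\theta = 0$, whose linearization about $\theta = 0$ has characteristic equation $\lambda^2 + \mu\lambda + \omega^2 = 0$. The function name is hypothetical.

```python
import cmath

def pendulum_eigenvalues(mu, omega=1.0):
    """Roots of lambda^2 + mu*lambda + omega^2 = 0 (linearization about theta = 0)."""
    disc = cmath.sqrt(mu * mu - 4.0 * omega * omega)
    return (-mu + disc) / 2.0, (-mu - disc) / 2.0

for mu in (0.0, 1e-3, 0.1):
    print(mu, pendulum_eigenvalues(mu))
```

With $\mu = 0$ the eigenvalues are purely imaginary, whilst for any small $\mu > 0$ they acquire the negative real part $-\mu/2$, so the centre becomes a stable focus.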

Now consider the second order system

$$\dot{x} = \mu - x^2, \quad \dot{y} = -y. \tag{13.33}$$

For $\mu < 0$ there are no equilibrium points, whilst for $\mu > 0$ there are two equilibrium
points at $P_+ = (+\sqrt{\mu}, 0)$ and $P_- = (-\sqrt{\mu}, 0)$. The point $P_+$ is a stable node, whilst
$P_-$ is a saddle point. When $\mu = 0$ there is a single, nonhyperbolic equilibrium point
at $(0,0)$, a saddle-node. The various phase portraits are shown in Figure 13.24. The
point $\mu = 0$, $x = y = 0$ is called a saddle-node bifurcation point, and (13.33)
is its normal form. At such a point, a saddle and a node collide and disappear.
Note that, since $\dot{y} = -y$, all integral paths asymptote to the $x$-axis as $t \to \infty$,
where the dynamics are controlled by $\dot{x} = \mu - x^2$. This is just the normal form
of the saddle-node bifurcation in a first order equation. In effect, we can ignore
the dynamics in the $y$-direction, and concentrate on the dynamics on the $x$-axis,
which is the centre manifold for the bifurcation. This shows why it was important
to study bifurcations in first order systems, since the important dynamics of higher
order systems occur on a lower order centre manifold. For more on bifurcation
theory, see Arrowsmith and Place (1990).

Two other simple types of bifurcation are (see Exercise 13.8) the transcritical
bifurcation, with normal form

$$\dot{x} = \mu x - x^2, \quad \dot{y} = -y, \tag{13.34}$$

and the supercritical and subcritical pitchfork bifurcations, with normal
forms

$$\dot{x} = \mu x \pm x^3, \quad \dot{y} = -y. \tag{13.35}$$

The behaviour on the $x$-axis is, in each case, analogous to the corresponding bifurcation
in a one-dimensional system, which we studied in Section 13.3.1. In each of
these examples, one eigenvalue passes through zero at the bifurcation point. We
will not consider the more degenerate case where both eigenvalues pass through
zero (see Guckenheimer and Holmes, 1983). However, we will consider one other
type of bifurcation, which occurs when a complex conjugate pair of eigenvalues
passes through the imaginary axis away from the origin. This is called the Hopf
bifurcation, and has no counterpart in first order systems.

The normal form of the Hopf bifurcation is most easily written in terms of polar
coordinates as

$$\dot{r} = \mu r + a r^3, \quad \dot{\theta} = \omega. \tag{13.36}$$

Let’s assume that $a < 0$. For $\mu < 0$ there is a stable focus at the origin, and no other
equilibrium points, and all trajectories spiral into the origin from infinity. For $\mu > 0$
there is an unstable focus at the origin, and no other equilibrium points, but there
is also a stable limit cycle, given by $r = \sqrt{-\mu/a}$, $\theta = \omega t$. Trajectories spiral onto the
limit cycle both from the origin and from infinity, as shown in Figure 13.25. When
$\mu = 0$ a linear analysis shows that the origin is a centre, and hence has eigenvalues
with zero real part. However, this is an example where the nonlinear terms, here
$\dot{r} = a r^3$, cause the integral paths to spiral into the equilibrium point. Note that
this is a supercritical Hopf bifurcation. The subcritical Hopf bifurcation,
for which the limit cycle is unstable and exists for $\mu < 0$, occurs when $a > 0$. In
general, for equations that model real physical systems, any limit cycle solutions
that exist are formed at Hopf bifurcations, and it is therefore crucial to determine
the position of these bifurcations. In order to demonstrate that a Hopf bifurcation
occurs when a pair of eigenvalues crosses the imaginary axis, we can appeal to the
Hopf bifurcation theorem.

Fig. 13.24. A saddle-node bifurcation in a second order system, for (a) $\mu > 0$, (b) $\mu = 0$,
(c) $\mu < 0$.
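The behaviour of the radial part of the normal form (13.36) is easy to confirm numerically. The following sketch integrates $\dot{r} = \mu r + a r^3$ with a classical fourth order Runge-Kutta scheme; the parameter values $\mu = \pm 0.25$, $a = -1$ and the function names are illustrative assumptions.

```python
import math

def radial_rhs(r, mu, a):
    """Radial part of the Hopf normal form (13.36)."""
    return mu * r + a * r ** 3

def integrate_radius(r0, mu, a, t_end=60.0, dt=0.01):
    """Classical fourth order Runge-Kutta for the radial equation."""
    r = r0
    for _ in range(int(round(t_end / dt))):
        k1 = radial_rhs(r, mu, a)
        k2 = radial_rhs(r + 0.5 * dt * k1, mu, a)
        k3 = radial_rhs(r + 0.5 * dt * k2, mu, a)
        k4 = radial_rhs(r + dt * k3, mu, a)
        r += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return r

mu, a = 0.25, -1.0
r_cycle = math.sqrt(-mu / a)                 # predicted limit cycle radius
print(r_cycle, integrate_radius(0.05, mu, a), integrate_radius(2.0, mu, a))
print(integrate_radius(0.3, -mu, a))         # mu < 0: decay towards the origin
```

For $\mu > 0$ trajectories starting inside and outside the limit cycle both approach $r = \sqrt{-\mu/a}$, whilst for $\mu < 0$ the radius decays to zero.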


Fig. 13.25. A Hopf bifurcation for (a) $a < 0$, $\mu \leq 0$ and (b) $a < 0$, $\mu > 0$.

Theorem 13.8 (Hopf) Consider the second order system

$$\dot{x} = X(x, y; \mu), \quad \dot{y} = Y(x, y; \mu).$$

Suppose that

(i) $X(0,0;\mu) = Y(0,0;\mu) = 0$ for each $\mu \in [-\mu_0, \mu_0]$ for some $\mu_0 > 0$,

(ii) the Jacobian matrix evaluated at the origin with $\mu = 0$ is

$$\begin{pmatrix} 0 & -\omega \\ \omega & 0 \end{pmatrix}$$

for some $\omega \neq 0$,

(iii) the eigenvalues of the equilibrium point at the origin are complex conjugate
for $\mu \in [-\mu_0, \mu_0]$ and given by $\alpha(\mu) \pm i\beta(\mu)$, with $\alpha$ and $\beta$ real,

(iv) $a \neq 0$, where

$$a = \frac{1}{16}\left(X_{xxx} + Y_{xxy} + X_{xyy} + Y_{yyy}\right) + \frac{1}{16\omega}\left\{X_{xy}(X_{xx} + X_{yy}) - Y_{xy}(Y_{xx} + Y_{yy}) - X_{xx}Y_{xx} + X_{yy}Y_{yy}\right\}, \tag{13.37}$$

with all of these partial derivatives evaluated at the origin with $\mu = 0$.

Then

(i) If $a\alpha'(0) < 0$, a unique limit cycle solution bifurcates from the origin in
$\mu > 0$ as $\mu$ passes through zero. For $\mu < 0$ there exists a neighbourhood of
the origin that does not contain a limit cycle solution. The stability of the
limit cycle is the same as that of the origin in $\mu < 0$.

(ii) If $a\alpha'(0) > 0$, a unique limit cycle solution bifurcates from the origin in
$\mu < 0$ as $\mu$ passes through zero. For $\mu > 0$ there exists a neighbourhood of
the origin that does not contain a limit cycle solution. The stability of the
limit cycle is the same as that of the origin in $\mu > 0$.

The amplitude of the limit cycle grows like $|\mu|^{1/2}$ and its period tends to $2\pi/|\omega|$ as
$|\mu| \to 0$.

We will not prove this theorem here. The idea of the proof is to go through a series 
of algebraic transformations of the differential equations that reduce them to the
normal form, (13.36), and is algebraically very unpleasant. For as straightforward
an explanation of the details as you could hope for, see Glendinning (1994). Instead, 
we will concentrate on a concrete example to show how the proof of the theorem 
works. 


Example 

Consider the second order system

$$\dot{x} = X(x, y; \mu) = \mu x - \omega y + x^3, \quad \dot{y} = Y(x, y; \mu) = \omega x + \mu y + y^2, \tag{13.38}$$

where $\mu$ and $\omega$ are constants. The linear part of the system is in the normal form for
an equilibrium point at the origin with complex conjugate eigenvalues $\lambda = \mu \pm i\omega$,
which we discussed in Section 9.3.2. This means that the origin is a stable focus
for $\mu < 0$, a linear centre for $\mu = 0$ and an unstable focus for $\mu > 0$. The system
therefore satisfies conditions (i), (ii) and (iii) of Theorem 13.8. If we now substitute
this particular choice of $X$ and $Y$ into (13.37), we find that $a = 3/8$, and hence,
since $\alpha(\mu) = \mu$, that $a\alpha'(0) = 3/8 > 0$. The Hopf bifurcation at $\mu = 0$ is therefore subcritical, with
an unstable limit cycle emerging from the origin in $\mu < 0$.
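The value $a = 3/8$ can be checked without doing any differentiation by hand. The sketch below evaluates the partial derivatives in (13.37) at the origin using nested central differences, which is an implementation choice made here rather than anything prescribed by the theorem; the names `partial` and `hopf_coefficient`, the step size `h` and the choice $\omega = 1$ are assumptions.

```python
def partial(f, nx, ny, h=1e-2):
    """Mixed partial d^(nx+ny) f / dx^nx dy^ny at the origin, by nested central differences."""
    def d_dx(g):
        return lambda x, y: (g(x + h, y) - g(x - h, y)) / (2.0 * h)

    def d_dy(g):
        return lambda x, y: (g(x, y + h) - g(x, y - h)) / (2.0 * h)

    for _ in range(nx):
        f = d_dx(f)
    for _ in range(ny):
        f = d_dy(f)
    return f(0.0, 0.0)

def hopf_coefficient(X, Y, omega):
    """Evaluate a from (13.37) for x' = X(x, y), y' = Y(x, y) at mu = 0."""
    cubic = partial(X, 3, 0) + partial(Y, 2, 1) + partial(X, 1, 2) + partial(Y, 0, 3)
    Xxx, Xxy, Xyy = partial(X, 2, 0), partial(X, 1, 1), partial(X, 0, 2)
    Yxx, Yxy, Yyy = partial(Y, 2, 0), partial(Y, 1, 1), partial(Y, 0, 2)
    quad = Xxy * (Xxx + Xyy) - Yxy * (Yxx + Yyy) - Xxx * Yxx + Xyy * Yyy
    return cubic / 16.0 + quad / (16.0 * omega)

omega = 1.0
X = lambda x, y: -omega * y + x ** 3      # (13.38) with mu = 0
Y = lambda x, y: omega * x + y ** 2
a = hopf_coefficient(X, Y, omega)
print(a)
```

Because the right hand sides are low degree polynomials, the central differences are exact up to rounding error, so the printed value agrees with $3/8$ to many decimal places.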

Let’s now try to write (13.38) in the normal form (13.36) for sufficiently small 
$x$, $y$ and $\mu$, using the same transformations as those used in the proof of the Hopf
bifurcation theorem. We begin by defining $z = x + iy$, in terms of which (13.38)
can be written concisely as

$$\dot{z} = \lambda z - \frac{1}{4} i (z - z^*)^2 + \frac{1}{8}(z + z^*)^3, \tag{13.39}$$

where $\lambda = \mu + i\omega$ and an asterisk denotes the complex conjugate. The next step is
to make a quadratic near identity transformation,

$$z = w + a_1 w^2 + a_2 w w^* + a_3 w^{*2}. \tag{13.40}$$

The idea is that such a transformation leaves the linear part of the equation unchanged
when written in terms of $w$, but can be used to simplify the nonlinear
part by an appropriate choice of the constants $a_1$, $a_2$ and $a_3$. Specifically, we can
eliminate quadratic terms from (13.39). For $|z| \ll 1$,

$$\begin{aligned}
w &= z - a_1 w^2 - a_2 w w^* - a_3 w^{*2} \\
&= z - a_1 \left(z - a_1 w^2 - a_2 w w^* - a_3 w^{*2}\right)^2 \\
&\quad - a_2 \left(z - a_1 w^2 - a_2 w w^* - a_3 w^{*2}\right)\left(z^* - a_1^* w^{*2} - a_2^* w w^* - a_3^* w^2\right) \\
&\quad - a_3 \left(z^* - a_1^* w^{*2} - a_2^* w w^* - a_3^* w^2\right)^2 \\
&= z - a_1 z^2 - a_2 z z^* - a_3 z^{*2} + \left(2a_1^2 + a_2 a_3^*\right) z^3 \\
&\quad + \left(3a_1 a_2 + a_2 a_2^* + 2a_3 a_3^*\right) z^2 z^* \\
&\quad + \left(2a_1 a_3 + a_1^* a_2 + a_2^2 + 2a_2^* a_3\right) z z^{*2} + \left(a_2 a_3 + 2a_1^* a_3\right) z^{*3} + O(|z|^4).
\end{aligned}$$

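If we now take this expression, differentiate, and replace $\dot{z}$ and $\dot{z}^*$ using (13.39), we
arrive, after considerable effort, at

$$\begin{aligned}
\dot{w} &= \lambda z - \tfrac{1}{4} i (z - z^*)^2 + \tfrac{1}{8}(z + z^*)^3 \\
&\quad - (2a_1 z + a_2 z^*)\left\{\lambda z - \tfrac{1}{4} i (z - z^*)^2\right\} - (a_2 z + 2a_3 z^*)\left\{\lambda^* z^* + \tfrac{1}{4} i (z - z^*)^2\right\} \\
&\quad + 3\left(2a_1^2 + a_2 a_3^*\right)\lambda z^3 + \left(3a_1 a_2 + a_2 a_2^* + 2a_3 a_3^*\right)(2\lambda + \lambda^*) z^2 z^* \\
&\quad + \left(2a_1 a_3 + a_1^* a_2 + a_2^2 + 2a_2^* a_3\right)(\lambda + 2\lambda^*) z z^{*2} + 3\left(a_2 a_3 + 2a_1^* a_3\right)\lambda^* z^{*3} + O(|z|^4).
\end{aligned}$$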

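We can now use the definition, (13.40), of $w$ to eliminate $z$ from the right hand
side, retaining terms up to and including cubic order, and arrive at

$$\begin{aligned}
\dot{w} &= \lambda\left(w + a_1 w^2 + a_2 w w^* + a_3 w^{*2}\right) \\
&\quad - \left(\tfrac{1}{4} i + 2a_1\lambda\right)\left(w^2 + 2a_1 w^3 + 2a_2 w^2 w^* + 2a_3 w w^{*2}\right) \\
&\quad + \left\{\tfrac{1}{2} i - a_2(\lambda + \lambda^*)\right\}\left\{w w^* + a_3^* w^3 + (a_1 + a_2^*) w^2 w^* + (a_1^* + a_2) w w^{*2} + a_3 w^{*3}\right\} \\
&\quad - \left(\tfrac{1}{4} i + 2a_3\lambda^*\right)\left(w^{*2} + 2a_1^* w^{*3} + 2a_2^* w^{*2} w + 2a_3^* w^* w^2\right) \\
&\quad + \tfrac{1}{4} i (2a_1 w + a_2 w^*)(w - w^*)^2 - \tfrac{1}{4} i (a_2 w + 2a_3 w^*)(w - w^*)^2 + \tfrac{1}{8}(w + w^*)^3 \\
&\quad + 3\left(2a_1^2 + a_2 a_3^*\right)\lambda w^3 + \left(3a_1 a_2 + a_2 a_2^* + 2a_3 a_3^*\right)(2\lambda + \lambda^*) w^2 w^* \\
&\quad + \left(2a_1 a_3 + a_1^* a_2 + a_2^2 + 2a_2^* a_3\right)(\lambda + 2\lambda^*) w w^{*2} + 3\left(a_2 a_3 + 2a_1^* a_3\right)\lambda^* w^{*3} + O(|w|^4).
\end{aligned}$$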
We can now see that we can eliminate all of the quadratic terms by choosing

$$a_1 = -\frac{i}{4\lambda}, \quad a_2 = \frac{i}{2\lambda^*}, \quad a_3 = \frac{i}{4(\lambda - 2\lambda^*)},$$

which leaves us with

$$\dot{w} = \lambda w + A_1 w^3 + A_2 w^2 w^* + A_3 w w^{*2} + A_4 w^{*3},$$

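where

$$\begin{aligned}
A_1 &= -2a_1\left(\tfrac{1}{4} i + 2a_1\lambda\right) + a_3^*\left\{\tfrac{1}{2} i - a_2(\lambda + \lambda^*)\right\} + \tfrac{1}{8} \\
&\quad + \tfrac{1}{4} i (2a_1 - a_2) + 3\left(2a_1^2 + a_2 a_3^*\right)\lambda, \\
A_2 &= -2a_2\left(\tfrac{1}{4} i + 2a_1\lambda\right) + (a_1 + a_2^*)\left\{\tfrac{1}{2} i - a_2(\lambda + \lambda^*)\right\} + \tfrac{3}{8} \\
&\quad - 2a_3^*\left(\tfrac{1}{4} i + 2a_3\lambda^*\right) + \tfrac{1}{4} i (3a_2 - 2a_3 - 4a_1) + \left(3a_1 a_2 + a_2 a_2^* + 2a_3 a_3^*\right)(2\lambda + \lambda^*), \\
A_3 &= -2a_3\left(\tfrac{1}{4} i + 2a_1\lambda\right) + (a_1^* + a_2)\left\{\tfrac{1}{2} i - a_2(\lambda + \lambda^*)\right\} + \tfrac{3}{8} \\
&\quad - 2a_2^*\left(\tfrac{1}{4} i + 2a_3\lambda^*\right) + \tfrac{1}{4} i (4a_3 - 3a_2 + 2a_1) + \left(2a_1 a_3 + a_1^* a_2 + a_2^2 + 2a_2^* a_3\right)(\lambda + 2\lambda^*), \\
A_4 &= -2a_1^*\left(\tfrac{1}{4} i + 2a_3\lambda^*\right) + a_3\left\{\tfrac{1}{2} i - a_2(\lambda + \lambda^*)\right\} + \tfrac{1}{8} \\
&\quad + \tfrac{1}{4} i (a_2 - 2a_3) + 3\left(a_2 a_3 + 2a_1^* a_3\right)\lambda^*.
\end{aligned}$$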

The next step is to make another near identity transformation,

$$w = v + b_1 v^3 + b_2 v^2 v^* + b_3 v v^{*2} + b_4 v^{*3}, \tag{13.41}$$

and try to eliminate the cubic terms as well. Proceeding exactly as we did for
the quadratic terms, although the algebra is now easier since we only retain cubic
terms, we find that

$$\begin{aligned}
\dot{v} &= \lambda v + (A_1 - 2\lambda b_1) v^3 + \left\{A_2 - (\lambda + \lambda^*) b_2\right\} v^2 v^* + (A_3 - 2\lambda^* b_3) v v^{*2} \\
&\quad + \left\{A_4 + (\lambda - 3\lambda^*) b_4\right\} v^{*3} + O(|v|^4).
\end{aligned}$$

At first sight, it would appear that we can eliminate all of the cubic terms in
the same way as we did the quadratic terms. However, the coefficient of $b_2$ is
$\lambda + \lambda^* = 2\mu$, which is small when $|\mu| \ll 1$. Since we need $b_2 = O(1)$ for $|v| \ll 1$ and
$|\mu| \ll 1$, we conclude that we cannot eliminate the $v^2 v^*$ term, and hence that the
simplest normal form is

$$\dot{v} \sim \lambda v + A_2(0)\, v|v|^2,$$

using $v^2 v^* = v|v|^2$. A little algebra then shows that $a = \mathrm{Re}\, A_2(0) =
3/8$, consistent with our earlier calculation of $a$ from (13.37). Indeed, the hard part
of the proof of the Hopf bifurcation theorem is to show that $a$, as defined in (13.37), appears
in the normal form in this way. If we now write $v = re^{i\theta}$ and separate real and
imaginary parts, we recover the radial equation of the normal form (13.36), together
with $\dot{\theta} = \omega + O(r^2)$, and we are done.
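The coefficients in this calculation are tedious to evaluate by hand but straightforward with complex arithmetic. The sketch below evaluates $a_1$, $a_2$, $a_3$ from $\lambda = \mu + i\omega$, checks that the quadratic coefficients in the equation for $\dot{w}$ vanish, and computes the coefficient $A_2$ of $w^2 w^*$. The function name is hypothetical and the formula coded for $A_2$ is the one obtained by collecting the $w^2 w^*$ terms as described above; the real part of $A_2(0)$ is the quantity that enters the radial equation.

```python
def quadratic_and_cubic_coefficients(mu, omega):
    """Return the three quadratic coefficients (should vanish) and A_2."""
    lam = complex(mu, omega)
    lc = lam.conjugate()
    a1 = -1j / (4.0 * lam)
    a2 = 1j / (2.0 * lc)
    a3 = 1j / (4.0 * (lam - 2.0 * lc))
    a2c, a3c = a2.conjugate(), a3.conjugate()
    # Coefficients of w^2, w w* and w*^2 in the equation for dw/dt.
    quadratic = (
        lam * a1 - (0.25j + 2.0 * a1 * lam),
        lam * a2 + (0.5j - a2 * (lam + lc)),
        lam * a3 - (0.25j + 2.0 * a3 * lc),
    )
    # Coefficient of w^2 w* in the equation for dw/dt.
    A2 = (
        -2.0 * a2 * (0.25j + 2.0 * a1 * lam)
        + (a1 + a2c) * (0.5j - a2 * (lam + lc))
        + 3.0 / 8.0
        - 2.0 * a3c * (0.25j + 2.0 * a3 * lc)
        + 0.25j * (3.0 * a2 - 2.0 * a3 - 4.0 * a1)
        + (3.0 * a1 * a2 + a2 * a2c + 2.0 * a3 * a3c) * (2.0 * lam + lc)
    )
    return quadratic, A2

quadratic, A2 = quadratic_and_cubic_coefficients(mu=0.0, omega=1.0)
print(quadratic)
print(A2)
```

The real part of the printed $A_2$ is $3/8$, in agreement with (13.37); its imaginary part is nonzero, but this only alters the angular equation at $O(r^2)$ and not the radial equation.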

We should reiterate that this transformation and the ensuing algebra is not what 
needs to be done whenever the system you are studying contains a Hopf bifurcation 
point. The Hopf bifurcation theorem is far easier to use. We went through the 
details of this example purely to illustrate the steps involved in the proof. 

Figure 13.26 shows the unstable limit cycle solution for various $\mu < 0$. We
calculated these solutions numerically using MATLAB (see Section 9.3.4). For
$|\mu|$ sufficiently small, we can see that the limit cycle is circular. However, as $|\mu|$
increases, the limit cycle moves away from the neighbourhood of the origin and
becomes increasingly distorted as it begins to interact with a saddle point that lies
close to $(-1,-1)$. In fact, when $\mu = \mu_0 \approx -0.12176$, the limit cycle collides with
this saddle point, and is destroyed. This is an example of a homoclinic bifurcation,
which we discuss below.

Fig. 13.26. The unstable limit cycle solution of (13.38) for $\mu = -0.002$, $-0.005$, $-0.01$,
$-0.05$, $-0.1$, $-0.12$ and $-0.12176$.
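Because the limit cycle is unstable, it cannot be found by integrating (13.38) forwards in time, but it becomes attracting if time is reversed. The sketch below uses this idea with a hand-written fourth order Runge-Kutta step instead of MATLAB; the choices $\omega = 1$, $\mu = -0.01$, the starting point, the step size and the function names are all assumptions made for illustration.

```python
import math

def rhs(x, y, mu, omega=1.0):
    """Right hand side of (13.38)."""
    return mu * x - omega * y + x ** 3, omega * x + mu * y + y ** 2

def rk4_step(x, y, dt, mu):
    """One classical Runge-Kutta step; a negative dt integrates backwards in time."""
    k1x, k1y = rhs(x, y, mu)
    k2x, k2y = rhs(x + 0.5 * dt * k1x, y + 0.5 * dt * k1y, mu)
    k3x, k3y = rhs(x + 0.5 * dt * k2x, y + 0.5 * dt * k2y, mu)
    k4x, k4y = rhs(x + dt * k3x, y + dt * k3y, mu)
    return (x + dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6.0,
            y + dt * (k1y + 2 * k2y + 2 * k3y + k4y) / 6.0)

def limit_cycle_radii(mu, x0=0.1, t_relax=800.0, dt=0.02):
    """Integrate backwards in time, then record min and max radius over about one period."""
    x, y = x0, 0.0
    for _ in range(int(t_relax / dt)):
        x, y = rk4_step(x, y, -dt, mu)
    radii = []
    for _ in range(int(2.0 * math.pi / dt) + 1):
        x, y = rk4_step(x, y, -dt, mu)
        radii.append(math.hypot(x, y))
    return min(radii), max(radii)

mu = -0.01
r_min, r_max = limit_cycle_radii(mu)
print(r_min, r_max, math.sqrt(-mu / 0.375))
```

For small $|\mu|$ the recorded radii should lie close to the normal form prediction $\sqrt{-\mu/a}$ with $a = 3/8$, with a modest spread that reflects the slight distortion of the cycle away from a circle.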


13.3.3 Global Bifurcations 

With the exception of the previous example, all of the bifurcations that we have 
studied so far have been local bifurcations. That is, they have arisen because of a 
change in the nature of an equilibrium point as a parameter passes through a critical 
value. A global bifurcation is one that occurs because the qualitative nature of 
the solutions changes due to the interaction of two or more distinct features of 
the phase portrait. For second order systems, the crucial features are usually limit 
cycles and the separatrices of any saddle points. Figure 13.27 illustrates an example 
of a homoclinic bifurcation, which can occur when a limit cycle interacts with
a saddle point. A limit cycle is formed in a Hopf bifurcation when the bifurcation
parameter, $\mu$, is equal to $\mu_1$. As $\mu$ increases, the amplitude of the limit cycle
grows until, when $\mu = \mu_2 > \mu_1$, the limit cycle collides with the saddle point,
forming a connection between the stable and unstable separatrices. This is the
homoclinic bifurcation point. With the separatrices joined up like this, the system
is structurally unstable, since an arbitrarily small change to the equations can
destroy the connection. For $\mu > \mu_2$, there is no limit cycle.

Of course, it is one thing to describe a global bifurcation in qualitative terms, 
but quite another to be able to quantify when the bifurcation occurs, since a local 
analysis is of no help. We must usually resort to numerical methods, as we did for 
the example shown in Figure 13.26. We will therefore confine ourselves to a single 
example of a global bifurcation where analytical progress is possible. 





Fig. 13.27. A homoclinic bifurcation. 


Example: Travelling waves in cubic autocatalysis (continued) 

In Section 13.1.1 we performed a local analysis of the two equilibrium points of
(13.14). In order to determine for what values of $V$ a physically meaningful solution
exists, we need to determine for what values of $V$ the unstable separatrix of $P_1$,
labelled $S_1$ in Figure 13.4, enters the origin as $z \to \infty$. This is a global problem
involving the relative positions of $S_1$ and the stable manifold of $P_2$, labelled $S_2$ in
Figure 13.4.

We begin by defining the region

$$R = \left\{(\beta,\gamma) \;\middle|\; 0 < \beta < \tfrac{2}{3},\ -\tfrac{4}{27V} < \gamma < 0\right\} \cup \left\{(\beta,\gamma) \;\middle|\; \tfrac{2}{3} \leq \beta < 1,\ \gamma_h(\beta) < \gamma < 0\right\},$$

where $\gamma_h(\beta) = -\beta^2(1 - \beta)/V$ is the horizontal isocline. The region $R$ is shown
in Figure 13.28. Note that the point $\beta = \frac{2}{3}$, $\gamma = -\frac{4}{27V}$ is the local minimum of
$\gamma_h(\beta)$. This region is constructed so that there are three qualitatively different
possibilities.

- Case (a): $S_2$ enters $R$ through the $\beta$-axis,

- Case (b): $S_2 = S_1$, and hence asymptotes to $P_1$ as $z \to -\infty$,

- Case (c): $S_2$ enters $R$ through its lower boundary.

Note that $S_2$ cannot enter $R$ through the $\gamma$-axis, since $d\beta/dz = \gamma < 0$ there.
Cases (a), (b) and (c) are illustrated in Figure 13.28. In case (a), $S_1$ lies below
$S_2$ and therefore does not enter the origin and is swept into the region $\beta < 0$. It
cannot represent a physically meaningful solution in this case. In case (b), $S_2 = S_1$
represents a physically meaningful solution, and this solution enters the origin on
the stable manifold. In case (c), $S_1$ lies above $S_2$ and is therefore attracted into the
origin on the centre manifold. These arguments can be made more rigorous, but
we will not do this here (see Billingham and Needham, 1991, for more details).

We now need to determine which of these three cases arises for each value of $V$.
We can do this by defining a function $f(V)$, as illustrated in Figure 13.29.

- Case (a): $f(V)$ is equal to the value of $\beta$ where $S_2$ crosses the $\beta$-axis leaving the
region $R$,

- Case (b): $f(V) = 1$,

- Case (c): $f(V) = 1 - \gamma_0$, where $\gamma_0$ is the value of $\gamma$ where $S_2$ crosses the line
$\beta = 1$ (which it does, since $d\beta/dz = \gamma < 0$).

Defined in this way, $f(V)$ is continuous, and there is no physically meaningful
solution of (13.14) when $f(V) < 1$ (case (a)), and a unique physically meaningful
solution when $f(V) \geq 1$ (cases (b) and (c)).

Lemma 13.1 $f(V)$ is strictly monotone increasing for $V > 0$.

Proof When $V = V_0 > 0$, we define the region

$$D(V_0) = \left\{(\beta,\gamma) \;\middle|\; \gamma_S(\beta)\big|_{V = V_0} < \gamma < 0,\ 0 < \beta < 1\right\},$$

where $\gamma = \gamma_S(\beta)$ is the equation of $S_2$ within the region $R$, as illustrated in Figure 13.29.
From (13.17),

$$\frac{\partial}{\partial V}\left(\frac{d\gamma}{d\beta}\right) = -1 < 0.$$

This means that the slope of the integral path through any fixed point is strictly
monotone decreasing as $V$ increases. In particular, when $V = V_1 > V_0$, all integral
paths that meet the curved boundary of $D(V_0)$ (a boundary given by $S_2$ when
$V = V_0$) enter $D(V_0)$. In addition, from (13.14), all integral paths that meet the
straight parts of the boundary of $D(V_0)$ also enter $D(V_0)$. Finally, since $S_2$ is
directed along the vector $e_- = (1, -V)$ as it enters the origin, it lies outside $D(V_0)$
in a sufficiently small neighbourhood of the origin when $V = V_1$. We conclude that
when $V = V_1 > V_0$, $S_2$ cannot pass through the boundary of $D(V_0)$, and therefore
that $f(V_1) > f(V_0)$. □


We will leave the next stage of our argument as Exercise 13.11, in which you
are asked, helped by some hints, to show that $f(V) = O(V^2)$ for $V \ll 1$, and that
$f(V) \sim V$ for $V \gg 1$. Now, since $f(V) < 1$ for $V$ sufficiently small, $f(V) > 1$ for $V$
sufficiently large, $f(V)$ is strictly monotone increasing by Lemma 13.1, and $f(V)$
is continuous, we conclude from the intermediate value theorem that there exists a
unique value $V = V^*$, such that $f(V) \geq 1$ for $V \geq V^*$ and $f(V) < 1$ for $V < V^*$.
When $V = V^*$ there is therefore a global bifurcation, since we have now shown that
case (a) occurs when $V < V^*$, case (b) occurs when $V = V^*$ and case (c) occurs
when $V > V^*$. This global bifurcation comes about purely because of the relative
positions of the manifolds $S_1$ and $S_2$.

Fig. 13.28. The region $R$ and the three possible types of global behaviour of $S_1$ and $S_2$.

Fig. 13.29. The definition of $f(V)$ and the region $D(V)$ in each of the three cases.

In order to determine the numerical value of $V^*$, we would usually have to solve
the governing equations (13.14) numerically. However, we do know that when $V =
V^*$ the solution asymptotes to the origin on the stable manifold with $\beta = O(e^{-V^* z})$
as $z \to \infty$, whilst for $V > V^*$, the solution asymptotes to the origin on the centre
manifold, with $\beta \sim V/z$ as $z \to \infty$. We were also able to show that there is an
analytical solution, (13.18), when $V = 1/\sqrt{2}$, which has $\beta = O(e^{-z/\sqrt{2}})$ as $z \to \infty$.
This must therefore be the unique solution that corresponds to $V = V^*$, and we
conclude that the global bifurcation point is $V = V^* = 1/\sqrt{2}$.

In conclusion, we have shown that a unique travelling wave solution exists for
each $V \geq 1/\sqrt{2}$, and that the solution with $V = 1/\sqrt{2}$ asymptotes to zero exponentially
fast as $z \to \infty$, whilst this decay is only algebraic for $V > 1/\sqrt{2}$. This has
implications for the selection of the speed of the waves generated in an initial value
problem for equations (13.8), in particular that localized initial inputs of autocatalyst
generate waves with the minimum speed, $V = 1/\sqrt{2}$. The reader is referred
to Billingham and Needham (1991) for further details, and to King and Needham
(1994) for a similar analysis when the diffusion coefficient is not constant.


Exercises 

13.1 Sketch the phase portraits of the systems

(a) $\dot{x} = -x - 2y^2$, $\dot{y} = xy - y^3$,

(b) $\dot{x} = x^2$, $\dot{y} = -y - x^2$,

(c) $\dot{x} = y + x^2$, $\dot{y} = -y - x^2$,

in the neighbourhood of the origin, including in your sketch the unstable
and centre manifolds.

13.2 Use arguments based on Lyapunov’s theorems to determine the stability of
the equilibrium points of

(a) $\dot{x} = x^2 - 2y^2$, $\dot{y} = -4xy$,

(b) $\dot{x} = xy^2 - x$, $\dot{y} = x^2 y - y$.

13.3 Consider the second order differential equation $\ddot{\theta} + k\dot{\theta} + f(\theta) = 0$, where
$f$ is continuously differentiable on the interval $|\theta| < a$, with $a > 0$ a
given constant, $\theta f(\theta) > 0$ for $\theta \neq 0$ and $f(0) = 0$. By considering a
suitable Lyapunov function, show that the origin is an asymptotically stable
equilibrium point.

13.4 Euler’s equations for a rigid body spinning freely about a fixed point in
the absence of external forces are

$$A\dot{\omega}_1 - (B - C)\omega_2\omega_3 = 0,$$

$$B\dot{\omega}_2 - (C - A)\omega_3\omega_1 = 0,$$

$$C\dot{\omega}_3 - (A - B)\omega_1\omega_2 = 0,$$

where $A$, $B$ and $C$ are the principal moments of inertia, and $\omega = (\omega_1, \omega_2, \omega_3)$
is the angular velocity of the body relative to its principal axes.

Find all of the steady states of Euler’s equations. Using the ideas that we 
developed in Section 9.4, show that is a constant of the motion.
Sketch the phase portrait on the surface of the sphere $\omega_1^2 + \omega_2^2 + \omega_3^2 = \omega_0^2$.
Deduce that the steady state with $\omega_1 = \omega_0$, $\omega_2 = \omega_3 = 0$ is unstable if
$C < A < B$ or $B < A < C$, but stable otherwise.

Show that

$$V = \left[\left\{B(A - B)\omega_2^2 + C(A - C)\omega_3^2\right\} + B\omega_2^2 + C\omega_3^2 + A\left(\omega_1^2 + 2\omega_0\omega_1\right)\right]^2,$$

is a Lyapunov function for the case when $A$ is the largest moment of inertia,
so that the state $\omega_1 = \omega_0$, $\omega_2 = \omega_3 = 0$ is stable. Suggest a Lyapunov
function that will establish the stability of this state when $A$ is the smallest
moment of inertia. Are these states asymptotically stable? Perform a
simple experiment to verify your conclusions.

13.5 A particle of mass $m$ lies at $\mathbf{r} = (x, y, z)$ and moves in a potential field
$W(x, y, z)$, so that its equation of motion is

$$m\ddot{\mathbf{r}} = -\nabla W.$$

By writing $\dot{x} = u$, $\dot{y} = v$ and $\dot{z} = w$, express this equation of motion in
terms of first derivatives only. Suppose that $W$ has a local minimum at
$\mathbf{r} = \mathbf{0}$. By using the Lyapunov function

$$V = W + \frac{1}{2} m \left(u^2 + v^2 + w^2\right),$$

show that the origin is a stable point of equilibrium for the particle. What
do the level curves of $V$ represent physically? Is the origin asymptotically
stable?

If an additional, nonconservative force, $\mathbf{f}(u, v, w)$, also acts on the particle,
so that

$$m\ddot{\mathbf{r}} = -\nabla W + \mathbf{f},$$

describe qualitatively how the stability of the point of equilibrium at the
origin is affected.

13.6 Consider the three systems

(a) $\dot{y} = y(y - 1)(y - 2\lambda)$,

(b) $\dot{y} = y^2 + 4\lambda^2 - 1$,

(c) $\dot{y} = -y(4y^2 + \lambda^2 - 1)$.

Sketch the bifurcation diagram for each system, and show that each system
has two bifurcation points of the same type, which you should determine.
Close to each bifurcation point, write each system in the normal form
appropriate to the bifurcation.

13.7 (a) Consider the system

$$\dot{x} = -\epsilon + \mu x - x^2.$$

When $\epsilon = 0$ this is the normal form for a transcritical bifurcation.
Sketch the bifurcation diagram when $\epsilon \neq 0$, dealing separately with
the cases $\epsilon > 0$ and $\epsilon < 0$. In one case there are two saddle-node
bifurcation points, in the other there are no bifurcation points.

(b) Consider the system

$$\dot{x} = \mu x + 2\epsilon x^2 - x^3,$$

with $\epsilon \geq 0$. When $\epsilon = 0$ this is the normal form for a supercritical
pitchfork bifurcation. Sketch the bifurcation diagram when $\epsilon$ is
small, but nonzero.

What can you deduce from the answers to the two parts of this question?

13.8 Sketch the phase portraits when $\mu > 0$, $\mu = 0$ and $\mu < 0$ for the normal 
forms of (a) the transcritical and (b) the supercritical pitchfork bifurcation, 
given by (13.34) and (13.35). 

13.9 Give an informal proof of the transcritical bifurcation theorem and the 
pitchfork bifurcation theorem. 

13.10 Consider the system $\dot{x} = \mu x - \omega y + x^3$, $\dot{y} = \omega x + \mu y + y^3$. Use the Hopf 
bifurcation theorem to determine the nature of the Hopf bifurcation at 
$\mu = 0$. Use a near identity transformation to write this system in normal 
form, and confirm that this is consistent with the Hopf bifurcation theorem. 
Use MATLAB to solve this system of equations numerically. Plot how the 
limit cycle solution changes with $\mu$, and determine for what range of values 
of $\mu$ it exists. 

13.11 (a) Seek an asymptotic solution of (13.17) subject to the boundary 
condition (13.15), valid when $V \gg 1$, by rescaling $\gamma = V\hat{\gamma}$ and using an 
asymptotic expansion 

$$\hat{\gamma} = \hat{\gamma}_0(\beta) + V^{-2}\hat{\gamma}_1(\beta) + O(V^{-4}).$$


Hence show that f(V), as defined in Section 13.3.3, satisfies 

f(V) = V+1- -V- 1 + 0{V~ 3 ) for V > 1. 

6 

(b) Repeat part (a) when $V \ll 1$. In this case, you will need to seek 
a rescaling of the form $\beta = \phi(V)\hat{\beta}$, $\gamma = \psi(V)\hat{\gamma}$, and determine $\phi$ 
and $\psi$ by seeking an asymptotic balance in (13.17) and (13.15). You 
should find that, at leading order, 


dy P 2 

— = — 1 — — , subject to 7 
d(3 7 


— (3 as P — > 0. 


Integrate this equation numerically using MATLAB, and hence show 
that 


f{V)~P*V 2 as U — > 0, 

where P* is a constant that you should determine numerically. 
13.12 Project Consider the CSTR system that we studied in Section 13.3.1, but 
where, in addition, the autocatalyst is itself unstable, and breaks down to 
form the final product C through the chemical reaction 


$$B \to C, \qquad \text{rate } k_2 b.$$

(a) Show that the concentrations of A and B now satisfy the equations 

$$\frac{d\alpha}{d\tau} = -\alpha\beta^2 + \frac{1 - \alpha}{\tau_{\rm res}}, \qquad \text{(E13.1)}$$

$$\frac{d\beta}{d\tau} = \alpha\beta^2 + \frac{\beta_0 - \beta}{\tau_{\rm res}} - \frac{\beta}{\tau_2}, \qquad \text{(E13.2)}$$

where $\tau_2$ is a dimensionless constant that you should determine. 

(b) Using the ideas that we developed in Section 13.3.1, show that the 
steady state solutions are again given by the points of intersection 
of a cubic polynomial and a straight line through the origin. Without 
making any quantitative calculations, sketch the position of the 
steady states in the $(\tau_{\rm res}, z)$-plane, firstly when there is no autocatalyst 
in the inflow, and secondly when there is. 

(c) Now restrict your attention to the case where there is no autocatalyst 
in the inflow. Determine the range of values for which there are 
three steady state solutions. Show that the smallest of these steady 
states is stable and that the middle steady state is a saddle point. 
Show that the largest steady state loses stability through a Hopf 
bifurcation, whose location you should determine. Use the Hopf 
bifurcation theorem to determine when this is supercritical and when 
it is subcritical. 

(d) Use MATLAB to integrate (E13.1) and (E13.2) numerically, and 
hence investigate what happens to the limit cycle that forms at the 
Hopf bifurcation point. Draw the complete bifurcation diagram, 
indicating the location of any limit cycles. What advice would you 
give to an engineer trying to maximize the output of C from the 
CSTR? Hint: For some parameter ranges, there is more than one 
limit cycle. 



CHAPTER FOURTEEN 


Time-Optimal Control in the Phase Plane 


Many physical systems that are amenable to mathematical modelling do not exist 
in isolation from human intervention. A good example is the British economy, 
for which the Treasury has a complicated mathematical model. The state of the 
system (the economy) is given by values of the dependent variables (for example, 
unemployment, foreign exchange rates, growth, consumer spending and inflation), 
and the government attempts to control the state of the system to a target state 
(low inflation, high employment, high growth) by varying several control parameters 
(most notably taxes and government spending). There is also a cost associated with 
any particular action, which the government tries to minimize (some function of, 
for example, government borrowing and, one would hope, the environmental cost 
of any government action or inaction). The optimal control leads to the economy 
reaching the target state with the smallest possible cost. 

Another system, for which we have studied a simple mathematical model, con- 
sists of two populations of different species coexisting on an isolated island. For the 
case of two herbivorous species, which we studied in Chapter 9, we saw that one 
species or the other will eventually die out. If the island is under human manage- 
ment, this may well be undesirable, and we would like to know how to intervene to 
maintain the island close to a state of equilibrium, which we know, if left uncon- 
trolled, is unstable. We could choose between either continually culling the more 
successful species, continually introducing animals of the less successful species or 
some combination of these two methods of control. Each of these actions has a cost 
associated with it. 

These are examples of optimal control problems. Optimal control is a huge topic, 
and in this short, introductory chapter we will study just about the simplest type of 
problem, which involves linear, constant coefficient ordinary differential equations. 
These are, however, extremely important, as it is often necessary to control small 
deviations from a steady state, for example when steering a ship, for which linear 
equations are a good approximation. We will also restrict our attention to time- 
optimal control, for which the cost function is the time taken for the system to reach 
the target state. We wish to drive the system to the target state in the shortest 
possible time. 

The crucial results that we will work towards are the properties of the controlla- 
bility matrix and the application of the time-optimal maximum principle. Although 
we will give proofs of the various results that we need, the main thing is to know 
how to apply them. 
14.1 Definitions 

Let’s begin by formalizing some of the ideas that we introduced above. Consider 
the system of ordinary differential equations 

$$\dot{\mathbf{x}} = \mathbf{f}(\mathbf{x}, \mathbf{u}, t), \quad \text{subject to } \mathbf{x}(0) = \mathbf{x}^0.$$

Note that we use the superscript 0 to indicate the initial state and the superscript 1 
to indicate the final state. We say that $\mathbf{x}(t)$ is the state vector, whose components 
$\mathbf{x} = (x_1(t), x_2(t), \ldots, x_n(t))$ are the state variables, and $\mathbf{u}$ is the control vector, 
whose components $\mathbf{u} = (u_1(t), u_2(t), \ldots, u_m(t))$ are the control variables. We 
assume that f is continuously differentiable with respect to x, u and t, but that u is 
merely integrable, so that we can allow for discontinuous changes in its components, 
the control variables. As we have seen, these conditions guarantee the existence 
and uniqueness of the solution for a given control u. 

A control problem takes the form of a question: if $\mathbf{x} = \mathbf{x}^0$ when $t = 0$, can we 
choose the control vector $\mathbf{u}(t)$ so that $\mathbf{x} = \mathbf{x}^1$, the target state, when $t = t_1$? In 
other words, can the system be controlled from $\mathbf{x}^0$ to $\mathbf{x}^1$, reaching $\mathbf{x}^1$ when $t = t_1$? 
If there is a cost function, 

$$J = \int_0^{t_1} g_0(\mathbf{x}(t), \mathbf{u}(t), t)\,dt,$$

associated with the control problem, such that we seek controls $\mathbf{u}$ for which $J$ is a 
minimum, we have an optimal control problem. If $g_0 = 1$, and hence $J = t_1$, 
so that we seek to minimize the time taken to reach the state $\mathbf{x}^1$, we have a 
time-optimal control problem. This is the type of problem that we will be studying. 
In particular, for first order, linear, constant coefficient equations, 

$$\frac{dx}{dt} = Ax + Bu(t),$$

and for second order, linear, constant coefficient equations, 

$$\frac{dx_1}{dt} = A_{11}x_1 + A_{12}x_2 + B_{11}u_1(t) + B_{12}u_2(t),$$

$$\frac{dx_2}{dt} = A_{21}x_1 + A_{22}x_2 + B_{21}u_1(t) + B_{22}u_2(t).$$

We can write this more concisely using matrix notation as 

$$\frac{d\mathbf{x}}{dt} = A\mathbf{x} + B\mathbf{u},$$

where $A$ and $B$ are constant, $2 \times 2$ matrices. Note that there are at most $n$ 
independent control variables for an $n$th order linear system of this form. 

14.2 First Order Equations 

A good way of introducing many of the important, basic concepts of control theory 
is to study one-dimensional systems, in other words, first order, linear ordinary 
differential equations with constant coefficients. 
Example: The tightrope walker 


A tightrope walker is inherently unstable. Walking a tightrope is rather like trying 
to balance a pencil on its tip. Although the walker has an equilibrium position, 
it is unstable, and she must push herself back towards the vertical to stay up- 
right. Moreover, if her deviations from the vertical become too large, they cannot 
be controlled, and she falls off the tightrope. This is typical of many types of 
problem where the idea is to control small deviations from equilibrium as quickly 
as possible - linear, time-optimal control. 

A very simple model for the tightrope walker is 

$$\frac{dx}{dt} = x + u, \qquad (14.1)$$

where $x$ represents her angular deviation from the vertical and $u$ her attempts to 
control her balance. Let's assume that $x = x^0 > 0$ when $t = 0$. Can this deviation 
from the vertical be controlled at all? If so, how quickly can it be controlled, and 
what should the control be? If there is no attempt to control the deviation, $u = 0$, 
$x = x^0 e^t$ and the tightrope walker falls off. The upright position, $x = 0$, is an 
unstable equilibrium state. For a general control $u(t)$, we can solve (14.1) using the 
integrating factor $e^{-t}$, so that 

$$\frac{d}{dt}\left(e^{-t}x\right) = e^{-t}u(t),$$

and hence 

$$x = x^0 e^t + e^t\int_0^t e^{-\tau}u(\tau)\,d\tau. \qquad (14.2)$$

Since we want to control the system to $x = 0$ when $t = t_1$, we must have 

$$x^0 + \int_0^{t_1} e^{-\tau}u(\tau)\,d\tau = 0. \qquad (14.3)$$

Clearly, since $x^0 > 0$, $u$ must be negative for at least some of the period 
$0 \le t \le t_1$. Now, if $u(t)$ is not bounded below (the tightrope walker can apply an 
arbitrarily large restoring force), it is easy to choose $u$ to satisfy (14.3), for example 
with $u(t) = -e^t x^0/t_1$. This is unrealistic, since we can control the system in an 
arbitrarily short time $t_1$ by making $|u|$ sufficiently large. The time-optimal control 
problem is meaningless if the control is unbounded. From now on we restrict our 
attention to bounded controls with $-1 \le u(t) \le 1$. We will see later how to scale 
a slightly more realistic problem so that $u$ lies in this convenient range. In general, 
the components of a bounded control vector satisfy $-1 \le u_i(t) \le 1$. 

Intuitively, we can see that to push $x$ back towards equilibrium as quickly as 
possible, we need to push in the appropriate direction as hard as we can by taking 
$u(t) = -1$, so that (14.3) gives 

$$x^0 - \int_0^{t_1} e^{-\tau}\,d\tau = 0,$$

and hence 

$$t_1 = -\log(1 - x^0), \qquad (14.4)$$

with solution 

$$x(t) = 1 + (x^0 - 1)e^t. \qquad (14.5)$$

For $t > t_1$ we take $u = 0$, and the system remains in its equilibrium state, $x = 0$. 
Equation (14.4) shows that $t_1 \to \infty$ as $x^0 \to 1^-$. If $x^0 \ge 1$, the system cannot 
be controlled back to equilibrium. The tightrope walker cannot push hard enough, 
and she falls off. Similarly, if $x^0 < 0$, the time-optimal control is $u(t) = 1$, pushing 
in the other direction. Some time-optimal solutions are shown in Figure 14.1. 
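
As an informal numerical check on (14.4) and (14.5), the following Python sketch computes the minimum time and compares it with a crude forward-Euler integration of (14.1) under the control $u = -\mathrm{sgn}(x^0)$. Python is used here purely for illustration, and the function names and the step size are our own hypothetical choices rather than part of the analysis above.

```python
import math

def time_to_upright(x0):
    """Minimum time t1 = -log(1 - |x0|), from (14.4); requires |x0| < 1."""
    if abs(x0) >= 1.0:
        raise ValueError("not controllable: |x0| must be less than 1")
    return -math.log(1.0 - abs(x0))

def optimal_state(x0, t):
    """State at time t under the control u = -sgn(x0), following (14.5)."""
    if t >= time_to_upright(x0):
        return 0.0  # control switched off, system rests at equilibrium
    s = 1.0 if x0 > 0 else -1.0
    # (14.5) for x0 > 0, mirrored through the origin for x0 < 0
    return s * (1.0 + (abs(x0) - 1.0) * math.exp(t))

def simulate(x0, dt=1e-4):
    """Forward-Euler integration of dx/dt = x + u until x changes sign.

    The step size dt is an illustrative assumption.
    """
    u = -1.0 if x0 > 0 else 1.0
    x, t = x0, 0.0
    while x * x0 > 0:
        x += dt * (x + u)
        t += dt
    return t
```

For example, `time_to_upright(0.5)` returns $\log 2$, and `simulate(0.5)` should agree with it to within a few multiples of the step size.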



Fig. 14.1. Some time-optimal solutions, (14.5), of the tightrope walker problem. 

We can now make a couple of definitions. The controllable set at time $t_1$ 
is the set of initial states that can be controlled to the origin in time $t_1$. For the 
tightrope walker problem, this is the set 

$$C(t_1) = \left\{x \mid |x| \le 1 - e^{-t_1}\right\}.$$

The controllable set is the set of initial states that can be controlled to the origin 
in some finite time. This is just the union of all the controllable sets at time $t_1 \ge 0$. 
For the tightrope walker problem, the controllable set is 

$$C = \bigcup_{t_1 \ge 0} C(t_1) = \{x \mid |x| < 1\}.$$

If $C$ is the whole real line, $C = \mathbb{R}$, we say that the system is completely 
controllable. The tightrope walker system is not completely controllable, since $C \ne \mathbb{R}$ 
(see Exercise 14.1). This is a good point at which to define the reachable set. If 
$x = x^0$ when $t = 0$, the reachable set in time $t_1$, $R(t_1, x^0)$, is the set of points $x^1$ 
for which there exists a control $u(t)$ such that $x(t_1) = x^1$. From (14.2), 


$$x^1 = x^0 e^{t_1} + e^{t_1}\int_0^{t_1} e^{-\tau}u(\tau)\,d\tau,$$

and hence, since $|u(t)| \le 1$, 

$$\left|x^1 e^{-t_1} - x^0\right| = \left|\int_0^{t_1} e^{-\tau}u(\tau)\,d\tau\right| \le \int_0^{t_1} e^{-\tau}|u(\tau)|\,d\tau \le \int_0^{t_1} e^{-\tau}\,d\tau = 1 - e^{-t_1},$$

so that 

$$(x^0 - 1)e^{t_1} + 1 \le x^1 \le (x^0 + 1)e^{t_1} - 1, \qquad (14.6)$$

defines the points in the reachable set, $R(t_1, x^0)$. For $|x^0| > 1$, the reachable 
set does not contain the origin, whilst for $|x^0| < 1$ the origin is reachable for 
$t_1 \ge -\log(1 - |x^0|)$, consistent with what we know about the controllable set. 
The reachable set is shown for two different cases in Figure 14.2. Note that the 
boundaries of the reachable set are given by the solutions with the controls $u(t) = \pm 1$, 
a fact that will prove to be important later. 
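
The bounds in (14.6) are easy to evaluate directly. The short Python sketch below, in which the function names are our own hypothetical choices, returns the end points of the reachable set and tests whether the origin lies between them.

```python
import math

def reachable_interval(x0, t1):
    """End points of the reachable set R(t1, x0) given by (14.6), with |u| <= 1."""
    lower = (x0 - 1.0) * math.exp(t1) + 1.0  # control u = -1 throughout
    upper = (x0 + 1.0) * math.exp(t1) - 1.0  # control u = +1 throughout
    return lower, upper

def origin_reachable(x0, t1):
    """True if x = 0 lies in the reachable set at time t1."""
    lower, upper = reachable_interval(x0, t1)
    return lower <= 0.0 <= upper
```

With $x^0 = 0.5$ the critical time is $-\log(1 - 0.5) = \log 2 \approx 0.693$, so `origin_reachable(0.5, 0.6)` is false and `origin_reachable(0.5, 0.7)` is true, in line with the discussion of the controllable set.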





Fig. 14.2. The reachable set for the tightrope walker problem lies between the curved 
lines. 







14.3 Second Order Equations 

For second order, linear, constant coefficient systems, we can examine the behaviour 
of the state variables, $x_1$ and $x_2$, in the $(x_1, x_2)$-phase plane (see Chapter 9). We 
will show that our intuitive notion that only the extreme values of the bounded 
control are used in the time-optimal control is correct. This is known as bang-bang 
control. Before we can do this, there are a few mathematical ideas that we 
must consider. All of the following can be generalized to systems of higher order, 
and some of it to nonlinear, nonautonomous systems. 


14.3.1 Properties of sets of points in the plane 

A convex set, $S$, is one for which the line segment between any two points in $S$ 
lies entirely within $S$ (see Figure 14.3 for some examples). Note that necessary 
but not sufficient conditions are that $S$ must be both connected (any two points 
in $S$ can be joined by a curve lying within $S$) and simply-connected ($S$ has no 
holes†). Formally, if $S$ is a convex set and $\mathbf{x}, \mathbf{y} \in S$, $c\mathbf{x} + (1 - c)\mathbf{y} \in S$ for all $c$ 
such that $0 \le c \le 1$. 





Fig. 14.3. Some examples of convex and nonconvex sets of points in the plane. 


† More formally, $S$ is connected and any closed loop lying in $S$ can be shrunk continuously to a 
point without leaving S. 





An interior point of $S$ is a point $\mathbf{x} \in S$ for which there exists a disc of points 
centred on $\mathbf{x}$, all of which lie in $S$ (see Figure 14.4). 

The interior, Int(S), of a set S is the set of all the interior points of S. 

An exterior point of $S$ is a point in the interior of $S^c$, the complement of $S$. 
The exterior, Ext(S), of a set S is the set of all the exterior points of S. 

A boundary point of S is a point, not necessarily in S, that lies in neither the 
interior nor the exterior of S (see Figure 14.4). Note that all discs centred on a 
boundary point of S contain a point that is not in S. 

The boundary of a set S is the set of all boundary points of S, and can therefore 
be written as 


$$\partial S = \{\mathbf{x} \mid \mathbf{x} \notin \mathrm{Int}(S)\} \cap \{\mathbf{x} \mid \mathbf{x} \notin \mathrm{Int}(S^c)\}.$$



Fig. 14.4. Examples of interior and boundary points of S, and discs centred on them. 


If $S$ does not contain any of its boundary points, it is said to be open. For 
example, the set of points with $|\mathbf{x}| < 1$ (the open unit disc) is open. Note that 
the boundary of the open unit disc is the unit circle, $|\mathbf{x}| = 1$. An open set has 
$S = \mathrm{Int}(S)$. 

A closed set contains all of its boundary points. For example, the set of points 
with $|\mathbf{x}| \le 1$ (the closed unit disc) is closed. 

A set S is strictly convex if, for each pair of points in S, the line segment joining 
them is entirely made up of interior points (see Figure 14.5). For a strictly convex 
set, the tangent to any boundary point does not meet S at any other boundary 
point. If S is convex, but not strictly convex, some of the tangents to the set will 
meet the boundary at more than one point along a straight part of the boundary 
(see Figure 14.6). 
Fig. 14.5. Examples of strictly convex and convex, but not strictly convex sets. 



Fig. 14.6. Examples of tangents to convex and strictly convex sets. 


14.3.2 Matrix solution of systems of constant coefficient ordinary 
differential equations 

Consider the system of n differential equations 

$$\frac{d\mathbf{x}}{dt} = A\mathbf{x} + \mathbf{b}(t), \quad \text{subject to } \mathbf{x}(0) = \mathbf{x}^0, \qquad (14.7)$$

with $A$ an $n \times n$ matrix of constants. In order to be able to write the solution of 
this equation in a compact form, we define the matrix exponential of $At$ to be 

$$\exp(At) = \sum_{k=0}^{\infty} \frac{A^k t^k}{k!}, \qquad (14.8)$$

with $A^0 = I$, the unit matrix. Note that this power series is convergent for all $A$ 
and $t$. We can see immediately that 

$$\frac{d}{dt}\exp(At) = \sum_{k=1}^{\infty} \frac{A^k t^{k-1}}{(k-1)!} = A\exp(At),$$

a property that the matrix exponential shares with its scalar counterpart, $e^{at}$. 
Now consider the product 

$$P(t) = \exp(At)\exp(-At).$$

Firstly, we note that $P(0) = I$. Secondly, using the product rule, 

$$\frac{dP}{dt} = A\exp(At)\exp(-At) - \exp(At)A\exp(-At) = 0,$$

using the fact that, from the definition, (14.8), 

$$A\exp(At) = \exp(At)A.$$

Similarly, $d^nP/dt^n = 0$, and hence from its Taylor series expansion, $P(t) = I$. In 
other words, 

$$\{\exp(At)\}^{-1} = \exp(-At),$$

again, in line with the result $e^{at}e^{-at} = 1$ for the scalar exponential function. Note 
that, in general, $\exp(A)\exp(B) \ne \exp(B)\exp(A)$ unless $AB = BA$ (see 
Exercise 14.4). 

These results mean that when $\mathbf{b}(t) = \mathbf{0}$, so that (14.7) is homogeneous, the 
solution can be written as 

$$\mathbf{x} = \exp(At)\mathbf{x}^0.$$

When the equation is inhomogeneous, we can use a matrix integrating factor to 
write 

$$\exp(-At)\frac{d\mathbf{x}}{dt} - \exp(-At)A\mathbf{x} = \frac{d}{dt}\{\exp(-At)\mathbf{x}\} = \exp(-At)\mathbf{b},$$

and hence, 

$$\mathbf{x} = \exp(At)\left\{\mathbf{x}^0 + \int_0^t \exp(-A\tau)\mathbf{b}(\tau)\,d\tau\right\}. \qquad (14.9)$$

This is the generalization to an $n$th order system of the first order solution, an 
example of which is given by (14.2). 

Finally, note that we know from the Cayley-Hamilton theorem (Theorem A1.1), 
which states that every matrix satisfies its own characteristic equation, that, for 
an $n \times n$ matrix $A$ and $k \ge n$, $A^k$ can be written as a linear combination of 
$I, A, A^2, \ldots, A^{n-1}$, and hence so can $\exp(At)$. In particular, for a $2 \times 2$ matrix, 
$\exp(At)$ is a linear combination of $I$ and $A$. 


Example: Simple harmonic motion 
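Simple harmonic motion with angular frequency $\omega$ is governed by 

$$\frac{d^2x}{dt^2} + \omega^2 x = 0.$$

We can write this as a system of two first order equations in the usual way as 

$$\dot{x}_1 = x_2, \quad \dot{x}_2 = -\omega^2 x_1,$$

with $x = x_1$. In terms of the generic equation (14.7), 

$$A = \begin{pmatrix} 0 & 1 \\ -\omega^2 & 0 \end{pmatrix}.$$

We now note that 

$$A^2 = \begin{pmatrix} -\omega^2 & 0 \\ 0 & -\omega^2 \end{pmatrix} = -\omega^2 I.$$

This means that 

$$A^{2n} = \omega^{2n}(-1)^n I, \quad A^{2n+1} = \omega^{2n}(-1)^n A,$$

and hence 

$$\exp(At) = \sum_{k=0}^{\infty} \frac{A^k t^k}{k!} = \sum_{n=0}^{\infty} \frac{A^{2n}t^{2n}}{(2n)!} + \sum_{n=0}^{\infty} \frac{A^{2n+1}t^{2n+1}}{(2n+1)!}$$

$$= I\sum_{n=0}^{\infty} \frac{(-1)^n\omega^{2n}t^{2n}}{(2n)!} + A\sum_{n=0}^{\infty} \frac{(-1)^n\omega^{2n}t^{2n+1}}{(2n+1)!}$$

$$= \cos\omega t\, I + \frac{1}{\omega}\sin\omega t\, A = \begin{pmatrix} \cos\omega t & \frac{1}{\omega}\sin\omega t \\ -\omega\sin\omega t & \cos\omega t \end{pmatrix}.$$

With the initial condition $x_1(0) = x_1^0$, $x_2(0) = x_2^0$, this means that the solution is 

$$\mathbf{x} = \exp(At)\mathbf{x}^0 = \begin{pmatrix} x_1^0\cos\omega t + \frac{1}{\omega}x_2^0\sin\omega t \\ -\omega x_1^0\sin\omega t + x_2^0\cos\omega t \end{pmatrix}.$$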
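
The closed form for $\exp(At)$ can be checked against a truncation of the power series (14.8). The Python sketch below does this for $2 \times 2$ matrices stored as nested lists; the helper names and the number of terms retained are our own illustrative assumptions.

```python
import math

def mat_mul(P, Q):
    """Product of two 2 x 2 matrices stored as nested lists."""
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def expm_series(A, t, terms=40):
    """Truncated power series (14.8) for exp(At); 'terms' is an assumed cut-off."""
    result = [[1.0, 0.0], [0.0, 1.0]]  # k = 0 term, the unit matrix
    term = [[1.0, 0.0], [0.0, 1.0]]
    for k in range(1, terms):
        # term_k = term_{k-1} (At) / k, so that term_k = A^k t^k / k!
        step = [[A[i][j] * t / k for j in range(2)] for i in range(2)]
        term = mat_mul(term, step)
        result = [[result[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return result

def expm_shm(omega, t):
    """Closed form of exp(At) for A = [[0, 1], [-omega^2, 0]]."""
    c, s = math.cos(omega * t), math.sin(omega * t)
    return [[c, s / omega], [-omega * s, c]]
```

For instance, with $\omega = 2$ and $t = 0.7$ the entries of `expm_series([[0.0, 1.0], [-4.0, 0.0]], 0.7)` and `expm_shm(2.0, 0.7)` should agree to rounding error, and multiplying the series for $t$ by the series for $-t$ should return the unit matrix.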


14.4 Examples of Second Order Control Problems 

Example 1: The positioning problem 

Consider the one-dimensional problem of positioning an object of mass $m$ in a 
frictionless groove using a bounded applied force $F(t)$ such that $-F_{\max} \le F(t) \le F_{\max}$. 
Newton's second law gives 

$$F(t) = m\frac{d^2x}{dt^2}, \quad \text{subject to } x(0) = X, \ \dot{x}(0) = 0, \ x(t_1) = 0, \ \dot{x}(t_1) = 0,$$

minimizing $t_1$. If we let 

$$\frac{F_{\max}}{m}x_1 = x, \quad \frac{F_{\max}}{m}x_2 = \dot{x}, \quad u = \frac{F(t)}{F_{\max}}, \quad \frac{F_{\max}}{m}x_1^0 = X,$$

we have a problem in the standard form, with $|u| \le 1$, 

$$\frac{dx_1}{dt} = x_2, \quad \frac{dx_2}{dt} = u, \quad x_1(0) = x_1^0, \quad x_2(0) = 0. \qquad (14.10)$$
In terms of our matrix notation† 

$$A = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \quad B = \begin{pmatrix} 0 \\ 1 \end{pmatrix}.$$



Let's now think what the optimal control might be. We want to push the particle 
to the origin as quickly as possible, but it must be at rest when we get it there. 
Presumably we should push as hard as we can for some period of time, $0 \le t \le T$, 
then decelerate as strongly as we can by pushing as hard as possible in the opposite 
direction for $T \le t \le t_1$, so that the particle is at rest at the origin when $t = t_1$. 
We will prove later that this is indeed the optimal method of control, but for now 
let's just construct the solution. 

Assuming that $x_1^0 > 0$, we take $u = -1$ for $0 \le t \le T$. We can easily integrate 
(14.10) and find that 

$$x_1 = x_1^0 - \tfrac{1}{2}t^2, \quad x_2 = -t \quad \text{for } 0 \le t \le T. \qquad (14.11)$$

Then we take $u = 1$ for $T \le t \le t_1$, and use $x_1 = x_1^0 - \tfrac{1}{2}T^2$, $x_2 = -T$ when $t = T$ 
to fix the constants of integration, which gives us 

$$x_1 = \tfrac{1}{2}t^2 - 2Tt + T^2 + x_1^0, \quad x_2 = t - 2T \quad \text{for } T \le t \le t_1. \qquad (14.12)$$

Now we just need to find $T$, the time at which we have to switch from accelerating 
as hard as possible to decelerating as hard as possible, and $t_1$, the total time taken 
to control the particle to the origin. We can obtain this by using $x_1 = x_2 = 0$ when 
$t = t_1$. This gives $T = \tfrac{1}{2}t_1$, so that the periods of acceleration and deceleration are 
equal, and $t_1 = 2\sqrt{x_1^0}$. The full solution is 

$$x_1 = \begin{cases} x_1^0 - \tfrac{1}{2}t^2 & \text{for } 0 \le t \le \sqrt{x_1^0}, \\ \tfrac{1}{2}t^2 - 2\sqrt{x_1^0}\,t + 2x_1^0 & \text{for } \sqrt{x_1^0} \le t \le 2\sqrt{x_1^0}, \end{cases} \qquad (14.13)$$

$$x_2 = \begin{cases} -t & \text{for } 0 \le t \le \sqrt{x_1^0}, \\ t - 2\sqrt{x_1^0} & \text{for } \sqrt{x_1^0} \le t \le 2\sqrt{x_1^0}. \end{cases} \qquad (14.14)$$

A typical solution is plotted as a function of $t$, and various solutions plotted in 
the $(x_1, x_2)$-phase plane in Figure 14.7. We will return later to the problem of 
determining the time-optimal control if the particle is not initially stationary, so 
that $x_2(0) \ne 0$. 
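
The bang-bang solution of the positioning problem can be confirmed numerically. In the Python sketch below, `switching_times` and `state` simply evaluate the formulae for $T$, $t_1$, (14.13) and (14.14), whilst `simulate` integrates (14.10) step by step with $u = -1$ followed by $u = +1$. The names and the number of steps are our own hypothetical choices.

```python
import math

def switching_times(x1_0):
    """Switching time T and arrival time t1 for a particle at rest at x1_0 > 0."""
    T = math.sqrt(x1_0)
    return T, 2.0 * T

def state(x1_0, t):
    """(x1, x2) from (14.13) and (14.14), valid for 0 <= t <= t1."""
    T, _ = switching_times(x1_0)
    if t <= T:
        return x1_0 - 0.5 * t * t, -t
    return 0.5 * t * t - 2.0 * T * t + 2.0 * x1_0, t - 2.0 * T

def simulate(x1_0, steps=20000):
    """Integrate (14.10) with u = -1 on the first half of [0, t1], u = +1 on the second.

    'steps' is an assumed even number of time steps, so the switch falls exactly at t = T.
    """
    _, t1 = switching_times(x1_0)
    dt = t1 / steps
    x1, x2 = x1_0, 0.0
    for n in range(steps):
        u = -1.0 if n < steps // 2 else 1.0
        x1 += dt * x2 + 0.5 * dt * dt * u  # exact update when u is constant over the step
        x2 += dt * u
    return x1, x2
```

With $x_1^0 = 4$ we have $T = 2$ and $t_1 = 4$, and `simulate(4.0)` should return a state that is zero to within rounding error.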


Fig. 14.7. Some time-optimal solutions of the positioning problem when the particle is 
initially stationary. 

† Note that, strictly speaking, the matrix $B$ should have another column of zeros, since there is 
no dependence on a second control function in this problem. We have omitted this for clarity. 


Example 2: The steering problem / The positioning problem with friction 

The forced motion of a ship is unstable, and the ship tends to drift off course if it is not 
controlled. If $x_1$ represents the deviation of the ship from a straight path, we can 
write a simple model as 

$$\frac{dx_1}{dt} = x_2, \quad \frac{dx_2}{dt} = -qx_2 + u. \qquad (14.15)$$

The drag force caused by the resistance of the water to the lateral motion of the 
ship is represented by the term $-qx_2$, with $q$ a positive constant. This is exactly 
the same as the positioning problem, but in a groove with friction. In terms of our 
matrix notation, 

$$A = \begin{pmatrix} 0 & 1 \\ 0 & -q \end{pmatrix}, \quad B = \begin{pmatrix} 0 \\ 1 \end{pmatrix}.$$

Example 3: Controlling a linear oscillator 

The equation for a linear oscillator (simple harmonic motion) of unit angular 
frequency, subject to an external force, $u(t)$, can be written as 

$$\frac{dx_1}{dt} = x_2, \quad \frac{dx_2}{dt} = -x_1 + u. \qquad (14.16)$$

In terms of our matrix notation, 

$$A = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \quad B = \begin{pmatrix} 0 \\ 1 \end{pmatrix}.$$

If the oscillator is not initially at rest and we wish to bring it to a halt as quickly 
as possible, we must solve (14.16) subject to 

$$x_1(0) = x_1^0, \quad x_2(0) = x_2^0, \quad x_1(t_1) = x_2(t_1) = 0, \qquad (14.17)$$

minimizing $t_1$. We can think of this as the problem of stopping a child on a swing 
as quickly as possible. 


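Example 4: The positioning problem with two controls 

Suppose that in the positioning problem we are also able to change the velocity of 
the particle through an extra control, so that 

$$\frac{dx_1}{dt} = x_2 + u_1, \quad \frac{dx_2}{dt} = u_2. \qquad (14.18)$$

In terms of our matrix notation, 

$$A = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \quad B = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}.$$
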

With the exception of the steering problem (see Exercise 14.9), we will solve all 
of these problems below after we have studied some more of the theory relevant to 
this type of control problem. 


14.5 Properties of the Controllable Set 

Apart from determining the time-optimal control, we often want to construct $C(t_1)$, 
the controllable set for any time $t_1$. There are a number of statements that we can 
make about the geometry of these sets, and of $C$, the controllable set. 

- If $t_1 < t_2$, $C(t_1) \subset C(t_2)$. In other words, the controllable set never gets smaller 
as $t$ increases. Any state controllable to zero in time $t_1$ is also controllable to 
zero in time $t_2 > t_1$. 

- $C(t)$ and $C$ are connected sets. 

- $C$ is open if and only if $\mathbf{0} \in \mathrm{Int}(C)$. 

These results hold for nonlinear, nonautonomous systems of ordinary differential 
equations. For linear, constant coefficient equations, we also have that $C(t)$ and $C$ 
are symmetric about the origin and convex. Note that this also implies that these 
sets are simply-connected, since all convex sets are simply-connected. 

Theorem 14.1 If $t_1 < t_2$, $C(t_1) \subset C(t_2)$. 

Proof Let $\mathbf{x}^0$ be a point in $C(t_1)$, with control $\mathbf{u} = \mathbf{v}(t)$. If we apply the control 

$$\mathbf{u} = \begin{cases} \mathbf{v}(t) & \text{for } 0 \le t \le t_1, \\ \mathbf{0} & \text{for } t_1 < t \le t_2, \end{cases}$$

the trajectory reaches $\mathbf{x} = \mathbf{0}$ when $t = t_1$, then remains there for $t_1 \le t \le t_2$, since 
$\mathbf{x} = \mathbf{0}$, $\mathbf{u} = \mathbf{0}$ is an equilibrium state. Therefore $\mathbf{x}^0 \in C(t_2)$, which means that 
$C(t_1) \subset C(t_2)$ (see Figure 14.8). □ 

Fig. 14.8. If $t_1 < t_2$, $C(t_1) \subset C(t_2)$. 


Theorem 14.2 If $\mathbf{x}^0 \in C(t_1)$ and $\mathbf{y}$ is a point on the trajectory from $\mathbf{x}^0$ to $\mathbf{0}$, 
$\mathbf{y} \in C(t_1)$. In other words, all points on controllable trajectories are controllable. 

Proof Let $\mathbf{x} = \mathbf{X}(t)$ be the trajectory containing both $\mathbf{x}^0$ and $\mathbf{y}$, with control $\mathbf{u}(t)$. 
When $t = \tau_1$, $\mathbf{X}(\tau_1) = \mathbf{y}$, and when $t = \tau_2$, $\mathbf{X}(\tau_2) = \mathbf{0}$, with $\tau_1 < \tau_2 \le t_1$ (see 
Figure 14.9). If we now consider the solution with control $\mathbf{v}(t) = \mathbf{u}(t + \tau_1)$ and 
initial condition $\mathbf{x}(0) = \mathbf{y}$, the system follows the same trajectory, $\mathbf{x} = \mathbf{X}(t + \tau_1)$, 
and reaches $\mathbf{x} = \mathbf{0}$ when $t = \tau_2 - \tau_1$. Therefore $\mathbf{y} \in C(\tau_2 - \tau_1)$ and, by Theorem 14.1, 
$\mathbf{y} \in C(t_1)$. □ 

Fig. 14.9. All points on controllable trajectories are controllable. 
Theorem 14.3 $C(t_1)$ and $C$ are connected sets. 

Proof If $\mathbf{x}^0 \in C(t_1)$ and $\mathbf{y}^0 \in C(t_1)$, there are, by definition, trajectories that 
connect each point to the origin (see Figure 14.10). By Theorem 14.2, all points on 
each of these trajectories lie in $C(t_1)$. Therefore the union of these two trajectories 
is a curve made up of points in $C(t_1)$ that connects $\mathbf{x}^0$ and $\mathbf{y}^0$, and hence, by 
definition, $C(t_1)$ is connected. Since $C = \bigcup_{t_1 \ge 0} C(t_1)$, $C$ is also connected. □ 

Fig. 14.10. $C(t_1)$ and $C$ are connected sets. 

Theorem 14.4 $C$ is open if and only if $\mathbf{0} \in \mathrm{Int}(C)$. 

Proof If $C$ is open, all of its points are interior points, so clearly $\mathbf{0} \in \mathrm{Int}(C)$. It is 
less straightforward to prove that $\mathbf{0} \in \mathrm{Int}(C)$ implies that $C$ is open. 

If $\mathbf{0} \in \mathrm{Int}(C)$, by definition there is a disc of radius $r$ centred on $\mathbf{0}$, which 
we write as $D(\mathbf{0}, r)$, that lies entirely within $C$. Now suppose that $\mathbf{u} = \mathbf{v}(t)$ is 
a control that steers some point $\mathbf{x}^0$ to $\mathbf{0}$ in time $t_1$. Let $D(\mathbf{x}^0, r^0)$ be a disc of 
radius $r^0$ centred on $\mathbf{x}^0$, and let $\mathbf{y}^0$ be another point within this disc, as shown in 
Figure 14.11. By continuity of the solutions of the underlying differential equations, 
if $r^0$ is sufficiently small, the control $\mathbf{v}(t)$ steers $\mathbf{y}^0$ into the disc $D(\mathbf{0}, r)$ on a path 
$\mathbf{y}(t)$ with $\mathbf{y}(t_1) \in D(\mathbf{0}, r)$ at time $t_1$. Since $D(\mathbf{0}, r) \subset C$, we can also find a 
control $\hat{\mathbf{v}}(t)$ that steers $\mathbf{y}(t_1)$ to $\mathbf{0}$ in some time $t_2$. Therefore $\mathbf{y}^0$ can be controlled 
to the origin in time $t_1 + t_2$ using the control 

$$\mathbf{u} = \begin{cases} \mathbf{v}(t) & \text{for } 0 \le t \le t_1, \\ \hat{\mathbf{v}}(t - t_1) & \text{for } t_1 < t \le t_1 + t_2. \end{cases}$$

Therefore $\mathbf{y}^0 \in C(t_1 + t_2) \subset C$, and hence, for $r^0$ sufficiently small, $D(\mathbf{x}^0, r^0) \subset C$ 
for all $\mathbf{x}^0 \in C$. By definition, $C$ is therefore open. □ 
Theorems 14.1, 14.2, 14.3 and 14.4 all hold for nonlinear, nonautonomous 
systems of ordinary differential equations. We now focus on linear, constant coefficient 
equations. Recall that if 

$$\frac{d\mathbf{x}}{dt} = A\mathbf{x} + B\mathbf{u},$$

with $A$ and $B$ constant matrices, the solution is 

$$\mathbf{x} = \exp(At)\left\{\mathbf{x}^0 + \int_0^t \exp(-A\tau)B\mathbf{u}(\tau)\,d\tau\right\}.$$

This means that $\mathbf{x}^0 \in C(t_1)$ if and only if there is a control $\mathbf{u}(t)$ such that 

$$\mathbf{x}^0 = -\int_0^{t_1} \exp(-A\tau)B\mathbf{u}(\tau)\,d\tau. \qquad (14.19)$$



Theorem 14.5 $C(t_1)$ is symmetric about the origin and convex. 

Proof If $\mathbf{x}^0 \in C(t_1)$ with control $\mathbf{u}(t)$, (14.19) shows that $-\mathbf{x}^0 \in C(t_1)$ with control 
$-\mathbf{u}(t)$.† Therefore $C(t_1)$ is symmetric about the origin. 

Now note that the set of bounded controls 

$$U = \{\mathbf{u}(t) \mid -1 \le u_i(t) \le 1\}$$

is convex, since if $\mathbf{u}^0 \in U$ and $\mathbf{u}^1 \in U$, $c\mathbf{u}^0 + (1 - c)\mathbf{u}^1 \in U$ for $0 \le c \le 1$. Therefore, 
if $\mathbf{u}^0$ and $\mathbf{u}^1$ are controls that steer $\mathbf{x}^0$ and $\mathbf{x}^1$ to the origin in time $t_1$, 

$$c\mathbf{x}^0 + (1 - c)\mathbf{x}^1 = -\int_0^{t_1} \exp(-A\tau)B\left\{c\mathbf{u}^0(\tau) + (1 - c)\mathbf{u}^1(\tau)\right\}d\tau,$$

and hence the control $c\mathbf{u}^0(\tau) + (1 - c)\mathbf{u}^1(\tau) \in U$ steers $c\mathbf{x}^0 + (1 - c)\mathbf{x}^1$ to the origin 
in time $t_1$. The line segment that joins $\mathbf{x}^0$ to $\mathbf{x}^1$ therefore lies entirely within $C(t_1)$, 
and hence, by definition, $C(t_1)$ is convex. □ 

† Note that the fact that the control variables are scaled so that $-1 \le u_i(t) \le 1$ is crucial here. 

Theorem 14.6 $C$ is symmetric about the origin and convex. 

Proof A union of symmetric sets is symmetric, so $C = \bigcup_{t_1 \ge 0} C(t_1)$ is symmetric. 
Although a union of convex sets is not necessarily convex (see Figure 14.12), since 
Theorem 14.1 tells us that $C(t_1) \subset C(t_2)$ for $t_1 < t_2$, $C$ is a union of nested convex 
sets and therefore is itself convex. □ 



Fig. 14.12. A union of convex sets is not necessarily convex. 


14.6 The Controllability Matrix 

For a second order system, we define the controllability matrix to be 

$$M = [B \;\; AB]. \qquad (14.20)$$

We will show that the system is completely controllable ($C = \mathbb{R}^2$) if and only if 

(i) $\mathrm{rank}(M) = 2$, 

(ii) all the eigenvalues of $A$ have zero or negative real part. 

Although we will not do so here, it is straightforward to generalize this result to 
$n$th order systems. 


Example: The positioning problem 

For the positioning problem 

$$A = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \quad B = \begin{pmatrix} 0 \\ 1 \end{pmatrix},$$

and hence 

$$M = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.$$

Clearly, $\mathrm{rank}(M) = 2$, since its columns are linearly independent. Also, since 

$$\det(A - \lambda I) = \lambda^2,$$

the matrix $A$ has the repeated eigenvalue zero. We conclude that the system is 
completely controllable. All of the other examples that we described in Section 14.4 
are also completely controllable (see Exercise 14.6). 
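
The two conditions on the controllability matrix are simple to test for a given second order system. The Python sketch below, with helper names and a tolerance that are our own hypothetical choices, forms $M = [B \;\; AB]$, finds its rank from its $2 \times 2$ minors, and checks the sign of the real parts of the eigenvalues of $A$ using the quadratic characteristic equation.

```python
import cmath

def mat_mul(P, Q):
    """Product of a 2 x 2 matrix P with a 2 x m matrix Q (nested lists)."""
    m = len(Q[0])
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(m)]
            for i in range(2)]

def controllability_matrix(A, B):
    """M = [B  AB], as in (14.20): the columns of B followed by those of AB."""
    AB = mat_mul(A, B)
    return [B[i] + AB[i] for i in range(2)]

def rank_two_rows(M, tol=1e-12):
    """Rank of a matrix with two rows, found from its 2 x 2 minors."""
    cols = len(M[0])
    if any(abs(M[0][i] * M[1][j] - M[0][j] * M[1][i]) > tol
           for i in range(cols) for j in range(i + 1, cols)):
        return 2
    if any(abs(v) > tol for row in M for v in row):
        return 1
    return 0

def eigenvalues(A):
    """Roots of the characteristic equation of a 2 x 2 matrix."""
    tr = A[0][0] + A[1][1]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    root = cmath.sqrt(tr * tr - 4.0 * det)
    return (tr + root) / 2.0, (tr - root) / 2.0

def completely_controllable(A, B, tol=1e-12):
    """Conditions (i) and (ii): rank(M) = 2 and no eigenvalue with positive real part."""
    M = controllability_matrix(A, B)
    return rank_two_rows(M) == 2 and all(lam.real <= tol for lam in eigenvalues(A))
```

For the positioning problem, `completely_controllable([[0.0, 1.0], [0.0, 0.0]], [[0.0], [1.0]])` returns `True`, whereas a system whose matrix $A$ has a positive eigenvalue fails condition (ii).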

Let’s now prove that these properties of the controllability matrix do determine 
whether or not a second order, linear, constant coefficient system is controllable. 

Theorem 14.7 $\mathbf{0} \in \mathrm{Int}(C)$ if and only if $\mathrm{rank}(M) = 2$. 

Proof Suppose that $\mathrm{rank}(M) < 2$. If $\mathrm{rank}(M) = 1$, there is a single direction, $\mathbf{y}$, 
orthogonal to every column of $M$. This means that $\mathbf{y}^T B = \mathbf{y}^T AB = 0$, and hence 
by the Cayley-Hamilton theorem, 

$$\mathbf{y}^T A^k B = 0 \quad \text{for } k = 0, 1, 2, \ldots.$$

This means that $\mathbf{y}^T\exp(-At)B = 0$. Now, if $\mathbf{x}^0 \in C(t_1)$, (14.19) shows that 

$$\mathbf{y}^T\mathbf{x}^0 = \mathbf{y}\cdot\mathbf{x}^0 = -\int_0^{t_1} \mathbf{y}^T\exp(-A\tau)B\mathbf{u}(\tau)\,d\tau = 0.$$

Therefore if $\mathrm{rank}(M) = 1$, $\mathbf{x}^0$ lies on the straight line through the origin 
perpendicular to $\mathbf{y}$, which is a closed set, and hence $\mathbf{0} \notin \mathrm{Int}(C)$. If $\mathrm{rank}(M) = 0$, $M$ has 
only zero entries, and hence so does $B$. There are therefore no controls, $C = \{\mathbf{0}\}$, 
and hence $\mathbf{0} \notin \mathrm{Int}(C)$. 

Now suppose that $\mathbf{0} \notin \mathrm{Int}(C)$. Since $C(t_1) \subset C$, $\mathbf{0} \notin \mathrm{Int}(C(t_1))$ at any time $t_1$. 
Since $\mathbf{0} \in C(t_1)$, the origin must be a boundary point of $C(t_1)$. Since $C(t_1)$ is convex 
(Theorem 14.5), there is a tangent to $C(t_1)$ through $\mathbf{0}$ with outward normal $\mathbf{z}$, and 
for all $\mathbf{x}^0 \in C(t_1)$, $\mathbf{z}^T\mathbf{x}^0 = \mathbf{z}\cdot\mathbf{x}^0 \le 0$ (see Figure 14.13). Equation (14.19) then shows 
that 

$$\int_0^{t_1} \mathbf{z}^T\exp(-A\tau)B\mathbf{u}(\tau)\,d\tau \ge 0$$

for all controls $\mathbf{u}$. However, since $-\mathbf{u}$ is also an admissible control, 

$$\int_0^{t_1} \mathbf{z}^T\exp(-A\tau)B\mathbf{u}(\tau)\,d\tau \le 0,$$

and hence 

$$\int_0^{t_1} \mathbf{z}^T\exp(-A\tau)B\mathbf{u}(\tau)\,d\tau = 0.$$

Since this must hold for all controls $\mathbf{u}$, 

$$\mathbf{z}^T\exp(-A\tau)B = 0 \quad \text{for } 0 \le \tau \le t_1.$$

If we now set $\tau = 0$ we get $\mathbf{z}^T B = 0$, and if we differentiate and put $\tau = 0$, 
$\mathbf{z}^T AB = 0$. This means that $\mathbf{z}$ is orthogonal to all of the columns of $M$, and hence 
$\mathrm{rank}(M) < 2$. □ 

Fig. 14.13. The tangent and outward normal to the set $C(t_1)$ at the origin. 





Since the system can only be completely controllable if $\mathbf{0} \in \mathrm{Int}(C)$, this theorem
shows that a necessary condition for the system to be completely controllable is
that rank(M) = 2.


Theorem 14.8 If rank(M) = 2 and $\mathrm{Re}(\lambda_i) \le 0$ for each eigenvalue $\lambda_i$ of A, the
system is completely controllable.


Proof We proceed by contradiction. Suppose that rank(M) = 2 and A has eigenvalues
with zero or negative real parts, but that $C \ne \mathbb{R}^2$. Consider a point $\mathbf{y} \notin C$.
There is then a tangent to C, with equation $\mathbf{n} \cdot \mathbf{x} = p$, that separates $\mathbf{y}$ from each
$\mathbf{x}^0 \in C$, with $\mathbf{n} \cdot \mathbf{x}^0 \le p$ and $\mathbf{n} \cdot \mathbf{y} \ge p$ (see Figure 14.14). Let $\mathbf{z}(\tau) = \mathbf{n}^T \{\exp(-A\tau)B\}$.
Because rank(M) = 2, $\mathbf{z} \ne \mathbf{0}$ for $0 \le \tau \le t_1$. Now choose a control with components
$u_i(\tau) = -\mathrm{sgn}(z_i(\tau))$, so that

$$\mathbf{n} \cdot \mathbf{x}^0 = -\int_0^{t_1} \mathbf{n}^T \exp(-A\tau) B \mathbf{u}(\tau)\, d\tau = -\int_0^{t_1} \mathbf{z}(\tau) \mathbf{u}(\tau)\, d\tau = \int_0^{t_1} \left( |z_1| + |z_2| \right) d\tau.$$

By choosing an appropriate coordinate system, we can make each component of $\mathbf{z}$
a sum of terms proportional to $e^{-\lambda_i \tau}$. If any eigenvalue has zero real part, the
corresponding component of $\mathbf{z}$ will be either a polynomial or a periodic function of $\tau$.
In each case, $\int_0^{t_1} (|z_1| + |z_2|)\, d\tau \to \infty$ as $t_1 \to \infty$. In particular, for $t_1$ sufficiently
large, $\mathbf{n} \cdot \mathbf{x}^0 > p$. This is a contradiction, and we conclude that $C = \mathbb{R}^2$. □


Fig. 14.14. A tangent to the set C that separates each $\mathbf{x}^0 \in C$ from $\mathbf{y} \notin C$.

Theorem 14.9 If rank(M) = 2 and A has at least one eigenvalue with positive
real part, the system is not completely controllable.

Proof Let $\lambda$ be an eigenvalue of A with positive real part, and $\mathbf{e}$ the associated
eigenvector, so that $\mathbf{e}^T A = \lambda \mathbf{e}^T$, and $\mathbf{e}^T A^k = \lambda^k \mathbf{e}^T$. This means that
$\mathbf{e}^T \exp(-A\tau) = e^{-\lambda\tau} \mathbf{e}^T$, and hence

$$\mathbf{e} \cdot \mathbf{x}^0 = -\int_0^{t_1} e^{-\lambda\tau}\, \mathbf{e}^T B \mathbf{u}(\tau)\, d\tau.$$

This integral converges as $t_1 \to \infty$, and is bounded above by some constant c, so
that $\mathbf{e} \cdot \mathbf{x}^0 \le c$. The controllable set therefore lies on one side of a line in the plane,
and hence $C \ne \mathbb{R}^2$. □

This concludes our proof that the controllability matrix has the properties that
we outlined at the start of this section.


14.7 The Time-Optimal Maximum Principle (TOMP) 

Consider the set reachable from $\mathbf{x}^0$ in time t, $R(t, \mathbf{x}^0)$. As time increases, this set
traces out a volume in $(x_1, x_2, t)$-space, which we label $RT(t, \mathbf{x}^0)$. If the system can
be controlled to the origin, the shortest time in which this can be achieved is $t^*$,
where $t^*$ is the first time when $\mathbf{0} \in R(t^*, \mathbf{x}^0)$ (see Figure 14.15).

Theorem 14.10 The time-optimal trajectory lies in the boundary, $\partial RT(t, \mathbf{x}^0)$.

Proof Let $\mathbf{u} = \mathbf{u}^*(t)$ be the optimal control for $\mathbf{x}(0) = \mathbf{x}^0$, and let $\mathbf{y}(t)$ be a solution
with $\mathbf{u} = \mathbf{u}^*$. Now suppose that $\mathbf{y}(t_0) \in \mathrm{Int}(R(t_0, \mathbf{x}^0))$. There must therefore be
a disc of sufficiently small radius r, $D(\mathbf{y}(t_0), r)$, that lies entirely within $R(t_0, \mathbf{x}^0)$.
If we now apply the optimal control, $\mathbf{u}^*$, to all of the points within $D(\mathbf{y}(t_0), r)$,
they will lie in the neighbourhood of $\mathbf{y}(t_1)$ when $t = t_1 > t_0$, and must also lie in
$R(t_1, \mathbf{x}^0)$. Therefore $\mathbf{y}(t_1) \in \mathrm{Int}(R(t_1, \mathbf{x}^0))$, so that any trajectory that starts in
$\mathrm{Int}(RT(t, \mathbf{x}^0))$ remains there. Since the origin lies in the boundary, $\partial RT(t, \mathbf{x}^0)$, the
time-optimal trajectory must lie entirely within this boundary. □

Fig. 14.15. The reachable set, $RT(t, \mathbf{x}^0)$.

Theorem 14.11 (The time-optimal maximum principle (TOMP)) The control
$\mathbf{u}(t)$ is time-optimal if and only if there exists a nonzero vector, $\mathbf{h}$, such that
for $0 \le t \le t_1$,

$$\mathbf{h}^T \{\exp(-At) B \mathbf{u}(t)\} = \sup_{\mathbf{v}(t)} \mathbf{h}^T \{\exp(-At) B \mathbf{v}(t)\}.$$

The components of the time-optimal control are

$$u_i(t) = \mathrm{sgn}\left[ \mathbf{h}^T \{\exp(-At) B\} \right]_i \quad \text{for } i = 1, 2, \ldots, m,$$

which is bang-bang control.

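Proof Let $\mathbf{u}(t)$ be a control that steers $\mathbf{x}$ from $\mathbf{x}^0$ when t = 0 to $\mathbf{x}^1 \in \partial R(t_1, \mathbf{x}^0)$
when $t = t_1$. The reachable set is convex (this can be proved in the same way that
we proved that the controllable set is convex in Theorem 14.5), so there is a tangent
at $\mathbf{x}^1$ with normal $\mathbf{n}$ such that

$$\mathbf{n} \cdot \mathbf{x}^1 = \sup_{\mathbf{y}^1 \in R(t_1, \mathbf{x}^0)} \mathbf{n} \cdot \mathbf{y}^1.$$

If $\mathbf{v}(t)$ is an arbitrary control and $\mathbf{y}(t)$ the corresponding solution,

$$\mathbf{y}^1 = \exp(At_1) \left\{ \mathbf{x}^0 + \int_0^{t_1} \exp(-A\tau) B \mathbf{v}(\tau)\, d\tau \right\},$$

and $\mathbf{y}^1 = \mathbf{x}^1$ when $\mathbf{v} = \mathbf{u}$. This means that

$$\mathbf{n} \cdot \exp(At_1)\mathbf{x}^0 + \mathbf{n} \cdot \left\{ \exp(At_1) \int_0^{t_1} \exp(-A\tau) B \mathbf{u}(\tau)\, d\tau \right\}
= \sup_{\mathbf{v}} \left[ \mathbf{n} \cdot \exp(At_1)\mathbf{x}^0 + \mathbf{n} \cdot \left\{ \exp(At_1) \int_0^{t_1} \exp(-A\tau) B \mathbf{v}(\tau)\, d\tau \right\} \right],$$

and hence

$$\{\mathbf{n}^T \exp(At_1)\} \int_0^{t_1} \exp(-A\tau) B \mathbf{u}(\tau)\, d\tau
= \sup_{\mathbf{v}} \{\mathbf{n}^T \exp(At_1)\} \int_0^{t_1} \exp(-A\tau) B \mathbf{v}(\tau)\, d\tau.$$

If we let $\mathbf{h}^T = \mathbf{n}^T \exp(At_1)$, we find that

$$\int_0^{t_1} \mathbf{h}^T \exp(-A\tau) B \mathbf{u}(\tau)\, d\tau = \sup_{\mathbf{v}} \left\{ \int_0^{t_1} \mathbf{h}^T \exp(-A\tau) B \mathbf{v}(\tau)\, d\tau \right\}. \tag{14.21}$$

Note that $\mathbf{h}$ is nonzero, because exp(At) is a nonsingular matrix since it always has
an inverse, exp(−At). Since (14.21) holds for all $t_1 > 0$, we must have

$$\mathbf{h}^T \exp(-At) B \mathbf{u}(t) = \sup_{\mathbf{v}} \left\{ \mathbf{h}^T \exp(-At) B \mathbf{v}(t) \right\}. \tag{14.22}$$

All of the steps of this argument can be reversed, and we have therefore proved the
first part of the theorem.

To obtain the maximum value of the right hand side of (14.22), we must take

$$v_i(t) = \mathrm{sgn}\left[ \mathbf{h}^T \exp(-At) B \right]_i \quad \text{for } 0 \le t \le t_1.$$

In other words, each component of the time-optimal control must take one of its
extreme values, 1 or −1, and change when $[\mathbf{h}^T \exp(-At) B]_i$ changes sign. This is
bang-bang control. □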

We shall see that this is the main result that we need to solve time-optimal control 
problems in the phase plane. 


Example: The positioning problem 


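For this problem

$$A = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \quad B = \begin{pmatrix} 0 \\ 1 \end{pmatrix}.$$

Since $A^2 = 0$,

$$\exp(-At) = I - At = \begin{pmatrix} 1 & -t \\ 0 & 1 \end{pmatrix}$$

and

$$\exp(-At)B = \begin{pmatrix} -t \\ 1 \end{pmatrix}.$$

Therefore, if we write $\mathbf{h}^T = (\alpha, \beta)$, we find that

$$\mathbf{h}^T \exp(-At) B = \beta - \alpha t.$$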

This means that the time-optimal, bang-bang control changes sign at most once. 
The TOMP does not tell us when, or even whether, this change in sign occurs. 
However, with $x_1(0) = x_1^0$ and $x_2(0) = 0$, an initially stationary particle, we have
seen that the control must change sign at least once if the solution is to reach 
the origin. We also constructed the unique solution with bang-bang control that 
changes sign just once. The TOMP now shows that this is indeed the time-optimal 
solution. 

Let’s now consider what happens if the particle is initially moving, with $x_1(0) = x_1^0$
and $x_2(0) = x_2^0$. Rather than just solving the equations, let’s think about what
the time-optimal solution will look like in the phase plane. We know that the
time-optimal control is bang-bang with u = ±1, so the integral paths are given by

$$\frac{dx_1}{dx_2} = \pm x_2,$$

and hence are parabolas, with

$$x_1 = k \pm \tfrac{1}{2} x_2^2$$

for some constant of integration k. The only two time-optimal paths that reach the
origin are therefore the appropriate branches of $x_1 = \pm\tfrac{1}{2} x_2^2$, which are labelled as
$S_\pm$ in Figure 14.16. On $S_+$ we need u = 1, whilst on $S_-$ we need u = −1. Any initial
conditions that lie on these curves can be controlled to the origin without changing 
the sign of u. For any other initial conditions, the system must be controlled 
onto one of the curves S±, when the control must change sign, as illustrated in 
Figure 14.16. 
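
The construction above can be checked numerically. The sketch below is an illustration in Python, not part of the original text: it uses the switching time and optimal control time quoted in Exercise 14.8 for initial conditions above the curves $S_\pm$, handles points below the curves by reflecting through the origin (an assumption that relies on the symmetry of the problem), and integrates $\dot{x}_1 = x_2$, $\dot{x}_2 = u$ exactly on each leg. The function names are hypothetical.

```python
import math


def flow(x1, x2, u, t):
    """Exact solution of x1' = x2, x2' = u for constant u after time t."""
    return x1 + x2 * t + 0.5 * u * t * t, x2 + u * t


def time_optimal_positioning(x1, x2):
    """Return (first control value, switching time T, total time t1)."""
    # s > 0 above the switching curve x1 = -x2|x2|/2, s < 0 below it.
    s = x1 + 0.5 * x2 * abs(x2)
    sign = 1.0 if s >= 0 else -1.0
    # Reflect points below the curve through the origin, then use the
    # formulas for points above it (first control u = -1).
    y1, y2 = sign * x1, sign * x2
    root = math.sqrt(max(y1 + 0.5 * y2 * y2, 0.0))
    return -sign, y2 + root, y2 + 2.0 * root


def final_state(x1, x2):
    """Apply the bang-bang control with one switch and return the end point."""
    u, T, t1 = time_optimal_positioning(x1, x2)
    x1, x2 = flow(x1, x2, u, T)
    return flow(x1, x2, -u, t1 - T)
```

For a particle initially at rest at $x_1 = 1$, the sketch gives a switch at T = 1 and arrival at the origin at $t_1 = 2$, and `final_state` returns the origin to rounding error for any starting point.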


Fig. 14.16. The time-optimal trajectories for the positioning problem.


Example: Controlling a linear oscillator
For a linear oscillator with unit angular frequency,

$$\exp(-At) = \begin{pmatrix} \cos t & -\sin t \\ \sin t & \cos t \end{pmatrix},$$

and

$$\exp(-At)B = \begin{pmatrix} -\sin t \\ \cos t \end{pmatrix}.$$

With $\mathbf{h}^T = (\alpha, \beta)$,

$$\mathbf{h}^T \exp(-At) B = \beta \cos t - \alpha \sin t.$$

The TOMP therefore shows that the time-optimal control is

$$u(t) = \mathrm{sgn}(\beta \cos t - \alpha \sin t) = \mathrm{sgn}\{a \cos(t + b)\},$$

for some constants a and b. Since cos(t + b) changes sign at intervals of $\pi$, so must
u(t). The first change of sign occurs when $t = T_0$ with $0 < T_0 \le \pi$, and must
subsequently change when $t = T_n = T_0 + n\pi$, for n = 0, 1, 2, . . . . Let’s consider the
solution when u = ±1 in the phase plane. The governing equations, (14.16), show
that

$$(x_1 \mp 1)\frac{dx_1}{dt} + x_2 \frac{dx_2}{dt} = 0,$$

and hence that

$$(x_1 \mp 1)^2 + x_2^2 = k^2,$$

for some constant of integration, k. These trajectories are circles, centred on (±1, 0),
as shown in Figure 14.17. Proceeding as we did for the positioning problem, we
can see that only the circles marked $S_\pm$ in Figure 14.17, given by $(x_1 \mp 1)^2 + x_2^2 = 1$,
enter the origin. We can now construct the time-optimal solution by considering
the solutions that meet $S_\pm$, with the control changing sign there. A typical example
is shown in Figure 14.18. The sign of the control must change with period $\pi$, so the
time-optimal solution consists of part of $S_+$ or $S_-$, and a succession of semicircles
of increasing radius with centres alternating between (1, 0) and (−1, 0). Intuitively,
we would perhaps have expected the control to change sign when $x_2 = 0$. Thinking
in terms of stopping a child on a swing, we might have expected to push in the
opposite direction to the motion. In fact, the time-optimal solution is out of phase
with this intuitive solution in order to allow the velocity to be zero precisely when
the swing is vertical.

Fig. 14.17. The bang-bang trajectories for the problem of controlling a linear oscillator.

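Two facts used in the oscillator example lend themselves to a quick numerical check: the bang-bang trajectories are circles centred on (u, 0), and the switching function $\beta \cos t - \alpha \sin t$ changes sign at intervals of $\pi$. The following Python sketch is illustrative only. It assumes the governing equations $\dot{x}_1 = x_2$, $\dot{x}_2 = -x_1 + u$, which are consistent with the matrix exp(−At) quoted above, and the function names are hypothetical.

```python
import math


def flow(x1, x2, u, t):
    """Exact solution of x1' = x2, x2' = -x1 + u for constant u after time t."""
    c, s = math.cos(t), math.sin(t)
    return u + (x1 - u) * c + x2 * s, -(x1 - u) * s + x2 * c


def switch_times(alpha, beta, t_end):
    """Zeros of beta*cos(t) - alpha*sin(t) in (0, t_end].

    The switching function equals a*cos(t + b) with a = hypot(alpha, beta)
    and b = atan2(alpha, beta), so its zeros are pi/2 - b + n*pi.
    """
    b = math.atan2(alpha, beta)
    t0 = (math.pi / 2 - b) % math.pi
    if t0 == 0.0:
        t0 = math.pi  # first switch lies in (0, pi]
    times = []
    t = t0
    while t <= t_end:
        times.append(t)
        t += math.pi
    return times
```

Starting from (2, 0.5) with u = 1, the quantity $(x_1 - 1)^2 + x_2^2$ stays at 1.25 along `flow`, and `switch_times(1.0, 2.0, 10.0)` returns three switching times spaced by $\pi$.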

Fig. 14.18. A typical time-optimal trajectory for the problem of controlling a linear oscillator.


Example: The positioning problem with two controls
For this problem

$$\exp(-At)B = \begin{pmatrix} 1 & -t \\ 0 & 1 \end{pmatrix}.$$

With $\mathbf{h}^T = (\alpha, \beta)$,

$$\mathbf{h}^T \exp(-At) B = (\alpha, \beta - \alpha t).$$

The TOMP therefore says that the time-optimal control $u_2 = \mathrm{sgn}(\beta - \alpha t)$ changes
sign at most once, as was the case for the positioning problem with just one control
variable. In contrast, $u_1 = \mathrm{sgn}(\alpha)$ does not change sign in the time-optimal control,
provided that $\alpha \ne 0$. If, however, $\alpha = 0$, $u_2$ does not change sign and $u_1$ is
undetermined. Let’s begin by considering this case, with $u_2 = -1$. This immediately
gives us

$$x_2 = x_2^0 - t,$$

which is therefore only appropriate when $x_2^0 > 0$. The optimal control time is
therefore $t_1 = x_2^0$. If we now integrate the equation for $x_1$, we obtain

$$x_1 = x_1^0 + x_2^0 t - \tfrac{1}{2} t^2 + \int_0^t u_1(\tau)\, d\tau.$$

Since $x_1(t_1) = 0$, we have a time-optimal trajectory for any control $u_1$ such that

$$\int_0^{t_1} u_1(\tau)\, d\tau = -x_1^0 - \tfrac{1}{2}(x_2^0)^2.$$

However, since $|u_1| \le 1$, this can only be achieved when

$$-x_2^0 \le -x_1^0 - \tfrac{1}{2}(x_2^0)^2 \le x_2^0,$$

and hence when

$$-x_2^0 - \tfrac{1}{2}(x_2^0)^2 \le x_1^0 \le x_2^0 - \tfrac{1}{2}(x_2^0)^2.$$

This is marked as region B in Figure 14.19. Similarly, when $x_2^0 < 0$ and $u_2 = 1$ we
have nonunique time-optimal paths in region A, where

$$x_2^0 + \tfrac{1}{2}(x_2^0)^2 \le x_1^0 \le -x_2^0 + \tfrac{1}{2}(x_2^0)^2.$$

Outside these two regions, the time-optimal control is unique, with $u_1 = u_2 = -1$ in
region C and $u_1 = u_2 = 1$ in region D, as marked in Figure 14.19. In region C, each
time-optimal path meets the boundary of region A, where the sign of $u_2$ changes,
and then follows the boundary of region A to the origin. Similarly in region D.
Note that, since $t_1 = |x_2^0|$ in regions A and B where the time-optimal solution
is not unique, the boundaries of the reachable set in time $t_1$ are straight
lines in these two regions. This means that $C(t_1)$ is not strictly convex. Although
we will not discuss this further, controllable sets that are not strictly convex are
always associated with nonunique time-optimal solutions.

Fig. 14.19. The regions A, B, C and D, and some time-optimal solutions for the positioning
problem with two controls.

In practice, the difficulty with applying bang-bang control lies in determining
when the control needs to change by making measurements of the state of the
system. This is known as measurement-action lag (see, for example, Marlin,
1995).


Exercises 

14.1 We have seen that the tightrope walker system, $\dot{x} = x + u$, is not completely
controllable. Show, by solving the governing equations, that the other
two qualitatively different, first order, linear, constant coefficient systems,
$\dot{x} = -x + u$ and $\dot{x} = u$, are completely controllable.

14.2 For the tightrope walker system, $\dot{x} = x + u(t)$ with $|u| \le 1$, find the
reachable set from x = 2 in time $t_1$, $R(t_1, 2)$, and the set controllable to
x = 2 in time $t_1$, $C(t_1, 2)$, and sketch them. Show that $R(t_1, 2) \not\subset R(t_2, 2)$
and $C(t_1, 2) \not\subset C(t_2, 2)$ when $t_1 < t_2$.

14.3 Consider the system of ordinary differential equations

$$\frac{d\mathbf{x}}{dt} = A\mathbf{x} + \mathbf{b}(t),$$

subject to $\mathbf{x}(0) = \mathbf{x}^0 = (x_1^0, x_2^0)^T$, with $\mathbf{b}(t) = (b_1(t), b_2(t))$. Determine the
exponential matrices, exp(At) and exp(−At), and hence write down both
the general solution and the solution when $\mathbf{b}$ is independent of t when A =


(i) 




(hi) 



0 

0 


14.4 Show that 


$$\exp(At) \exp(Bt) = \exp((A + B)t),$$


if and only if the matrices A and B commute. 

14.5 Classify each of the following sets of points in the plane firstly as either 
open, closed, or neither open nor closed, and secondly as either strictly 
convex, convex but not strictly convex, or not convex. 

(a) $\{(x_1, x_2) \mid (x_1 - 1)^2 + x_2^2 < 1\} \cup \{(x_1, x_2) \mid x_1^2 + x_2^2 < 1\}$,

(b) $\{(x_1, x_2) \mid (x_1 - 1)^2 + x_2^2 < 1\} \cap \{(x_1, x_2) \mid x_1^2 + x_2^2 < 1\}$,

(c) $\{(x_1, x_2) \mid x_1^2 + x_2^2 < 1\} \cap \{(x_1, x_2) \mid x_1 \ge 0\}$,

(d) $\{(x_1, x_2) \mid x_1^2 + x_2^2 < 1\} \cap \{(x_1, x_2) \mid x_1 > 0\}$.


14.6 Use the controllability matrix to show that Examples 2, 3 and 4 given in 
Section 14.4 are completely controllable systems. 

14.7 If $\dot{\mathbf{x}} = A\mathbf{x} + B\mathbf{u}$ with $\mathbf{u}$ the vector of bounded controls, construct the
controllability matrix, M, and hence determine whether the system is completely
controllable when


(a) 

(b) 

(c) 




14.8 Consider the positioning problem. Show that for initial conditions lying
above the curves $S_\pm$ the switching time is

$$T = x_2^0 + \sqrt{x_1^0 + \tfrac{1}{2}(x_2^0)^2},$$

and that the optimal control time is

$$t_1 = x_2^0 + 2\sqrt{x_1^0 + \tfrac{1}{2}(x_2^0)^2}.$$

Determine the controllable set in time $t_1$.

14.9 Consider the steering problem. Show that

$$\exp(-At) = \begin{pmatrix} 1 & \frac{1}{q}\left(1 - e^{qt}\right) \\ 0 & e^{qt} \end{pmatrix}.$$

Use the TOMP to show that the time-optimal control has at most one
change of sign. If $x_1(0) = x_1^0$ and $x_2(0) = 0$, show that the optimal control
time is

$$t_1 = \frac{2}{q} \log\left\{ \exp\left(\tfrac{1}{2} q^2 x_1^0\right) + \sqrt{\exp\left(q^2 x_1^0\right) - 1} \right\}.$$

Show that the integral paths for u = ±1 are

$$x_1 = k - \frac{1}{q^2}\left\{ q x_2 \pm \log(1 \mp q x_2) \right\},$$

with k a constant. Sketch the time-optimal trajectories and the controllable
set in time $t_1$ in the $(x_1, x_2)$-phase plane.

14.10 The population of a pest is increasing exponentially. To control the pest, 
a genetically engineered, sterile, predatory beetle is introduced into the 
environment. Since the beetles are harmful to crops, it is desirable to 
remove them as soon as possible. 

Let the populations of the pest and the beetle be denoted by $x_1(t)$ and
$x_2(t)$ creatures per square metre respectively. Initially, $x_1 = X > 0$ and
$x_2 = 0$, and the target is to reduce both populations to zero simultaneously
and in the shortest possible time. Model equations for the two populations
are

$$\dot{x}_1 = x_1 - x_2, \quad \dot{x}_2 = -x_2 + u(t),$$

with $|u| \le 1$ representing the rate at which beetles are released into the
environment. Use the time-optimal maximum principle to show that the
optimal control changes sign just once.

Show that the optimal control time is

$$t_1 = \log\left\{ \frac{X + 1 + \sqrt{X^2 + 4X}}{1 - 2X} \right\}.$$

What is the upper limit on X below which the system is controllable?
14.11 Consider the system

$$\dot{x}_1 = x_1 + x_2, \quad \dot{x}_2 = -x_2 + u(t),$$

subject to $x_1(0) = x_1^0$, $x_2(0) = x_2^0$, with $|u(t)| \le 1$. Use the time-optimal
maximum principle to show that the sign of the optimal control changes
at most once. Show that when $x_2$ is initially positive, the time-optimal
solution on which the control does not change sign has

$$x_1^0 = -\frac{(x_2^0)^2}{2(1 + x_2^0)}.$$

14.12 Determine the matrices exp(−At) and exp(At) when




Consider the second order system of ordinary differential equations

$$\frac{d\mathbf{x}}{dt} = A\mathbf{x} + B\mathbf{u},$$

where $\mathbf{x}$ is the vector of state variables, $\mathbf{u}$ the vector of bounded control
variables with $|u_i(t)| \le 1$, and



Use the time-optimal maximum principle to show that the sign of u(t)
can change no more than once in the time-optimal control. Show that the
time-optimal solutions on which the control does not change sign have

cosh t\ H 7 = 

V2 


A = ± ( 7= coshti + - 7 = ) , 


Xi = =b sinhti -= 


where $t_1$ is the time taken to control the system to the origin.



CHAPTER FIFTEEN


An Introduction to Chaotic Systems


In order to introduce the idea of a chaotic solution, we will begin by studying
three simple chaotic systems that arise in different physical contexts. We then look
at some examples of mappings, which are important because ordinary differential
equations can be related to mappings through the Poincaré return map. After
investigating homoclinic tangles in Poincaré return maps, which contain chaotic
solutions, we investigate how their existence can be established by examining the
zeros of the Mel’nikov function. Finally, we discuss the computation of the Lyapunov
spectrum of a differential equation, from which a quantitative measure of
chaos can be obtained.


15.1 Three Simple Chaotic Systems

15.1.1 A Mechanical Oscillator

Consider the mechanical system that consists of two rings of mass m threaded
onto two horizontal wires a distance a apart, as shown in Figure 15.1. The rings
are joined by a spring of natural length l > a that obeys Hooke’s law with elastic
constant $\mu$. If we move the upper ring, what happens to the lower ring? We denote
the displacement of the upper ring from a fixed vertical line by $\phi(t)$ and that of
the lower ring by y(t). On the assumption that a frictional force of magnitude
$mk\dot{y}$ opposes the motion of the lower ring, Newton’s second law in the horizontal
direction shows that

$$-\mu \left( \sqrt{a^2 + (y - \phi)^2} - l \right) \frac{y - \phi}{\sqrt{a^2 + (y - \phi)^2}} - mk\dot{y} = m\ddot{y}, \tag{15.1}$$

where a dot denotes d/dt. Let’s assume that the relative displacement of the two
rings, $Y = y - \phi$, is much less than the distance, a, between the wires, so that

$$\sqrt{a^2 + (y - \phi)^2} = a\sqrt{1 + \frac{Y^2}{a^2}} \sim a + \frac{Y^2}{2a}.$$

In terms of Y, (15.1) becomes

$$\ddot{Y} + k\dot{Y} - \frac{\mu(l - a)}{ma} Y + \frac{\mu l}{2ma^3} Y^3 = -\ddot{\phi} - k\dot{\phi}.$$

If we now define the dimensionless variables

$$x = \sqrt{\frac{l}{2a^2(l - a)}}\, Y, \quad \hat{t} = \sqrt{\frac{\mu(l - a)}{ma}}\, t,$$

we obtain

$$\ddot{x} + \epsilon\delta\dot{x} - x + x^3 = F(\hat{t}), \tag{15.2}$$

where $\epsilon\delta$ is the rescaled damping coefficient and $F(\hat{t})$ the rescaled forcing.
The reason for our rather odd choice of dimensionless parameters will become clear
later. Equation (15.2) with F = 0 is known as Duffing’s equation, and arises
in many other contexts in mechanics. With $F \ne 0$, (15.2) is the forced Duffing
equation.† We will assume that the forcing takes the form $F = \epsilon\gamma \cos \omega t$ (dropping
the hat on the time variable for convenience), so that (15.2) becomes

$$\ddot{x} + \epsilon\delta\dot{x} - x + x^3 = \epsilon\gamma \cos \omega t. \tag{15.3}$$

† For more on the dynamics of the forced Duffing equation, see Arrowsmith and Place (1990)
and references therein.

Fig. 15.1. A mechanical system whose small amplitude oscillations are governed by the
forced Duffing equation, (15.3).

Figure 15.2 shows the solution, calculated numerically using MATLAB (see Section
9.3.4), for fixed values of $\epsilon\delta$ and $\epsilon\gamma$, with y = dy/dt = 0 when t = 0. This solution
behaves erratically. Indeed, it is tempting to think of it as ‘random’ in some sense.
However, we know from Chapter 8 that the solution of (15.3) exists and is unique.
Figure 15.3 superimposes another solution, this time with $y = 10^{-4}$, dy/dt = 0
when t = 0. The two solutions begin close to each other, but, as time increases,
drift further apart, and soon diverge completely. We say that the system exhibits
sensitive dependence upon initial conditions. In practice, we can only know
the initial state of a physical system with a finite degree of accuracy. After a
sufficient time has elapsed, solutions with different initial conditions, but which are
close enough together that, in practice, they are indistinguishable, will diverge in a
chaotic system.



Fig. 15.2. The solution of the forced Duffing equation, (15.3), with
y = dy/dt = 0 when t = 0.


We can now give an informal definition of a chaotic solution as a bounded, 
aperiodic, recurrent solution, that has a random aspect due to its sensitive depen- 
dence on initial conditions. Adjacent chaotic solutions diverge exponentially fast, 
a property that we will later measure using the Lyapunov spectrum, and remain 
in a bounded region, where they undergo repeated folding, and are, in practice, 
unpredictable in the long term.

Fig. 15.3. The solution of the forced Duffing equation, (15.3), for the same parameter values, with
y = dy/dt = 0 when t = 0 (solid line) and $y = 10^{-4}$, dy/dt = 0 when t = 0 (broken line).


15.1.2 A Chemical Oscillator 

The next example that we will consider is a well-stirred system of reacting chem- 
icals, known as the cubic crosscatalator, which has at its heart the cubic auto- 
catalytic step that we studied in Chapter 13. The reaction scheme is 

P → A            rate $k_0 p$,      precursor decay,                  (15.4)

P + C → A + C    rate $k_c pc$,     catalysis of precursor decay,     (15.5)

A → B            rate $k_u a$,      uncatalyzed conversion,           (15.6)

A + 2B → 3B      rate $k_1 ab^2$,   cubic autocatalysis,              (15.7)

B → C            rate $k_2 b$,      autocatalyst decay,               (15.8)

C → D            rate $k_3 c$,      catalyst decay.                   (15.9)

In addition to the reactant, A, and autocatalyst, B, there is a precursor, P, which 
decays to produce A, and a catalyst, C, which accelerates the decay of the precursor,
and is produced by the decay of the autocatalyst. The action of C both at
the start of the reaction cascade, catalyzing the decay of the precursor, and as a
product at the end of the sequence P → A → B → C is the essential ingredient that
leads to complex behaviour. The reactant A also decays spontaneously to produce
B, and C is itself unstable, decaying to the inert product D.

Under the assumption that the precursor, P, is in large excess and decays slowly, 
we can derive the dimensionless governing equations (see Exercise 15.1) 

$$\begin{aligned}
\dot{\alpha} &= \kappa(1 + \eta\gamma) - \alpha\beta^2 - \epsilon\alpha, \\
\dot{\beta} &= \alpha\beta^2 - \beta + \epsilon\alpha, \\
\dot{\gamma} &= \beta - \chi\gamma,
\end{aligned} \tag{15.10}$$

where $\alpha$, $\beta$ and $\gamma$ are the dimensionless concentrations of A, B and C, and $\kappa$,
$\eta$, $\epsilon$ and $\chi$ are dimensionless constants. Although this system possesses a very
complicated set of different types of behaviour (see Petrov, Scott and Showalter
1992), we will focus on a single chaotic solution. Figure 15.4 shows the solution
when $\kappa = 0.71$, $\eta = 0.054$, $\epsilon = 0.005$ and $\chi = 0.25$. The single equilibrium point
has a one-dimensional stable manifold and a two-dimensional unstable manifold
associated with complex eigenvalues. The solution is continually attracted towards
the equilibrium point close to the stable manifold, and then spirals away close to 
the unstable manifold before the process begins again. 
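
The location of the single equilibrium point can be found by hand. Adding the first two equations of (15.10) at equilibrium gives $\beta = \kappa(1 + \eta\gamma)$, the third gives $\gamma = \beta/\chi$, and the second then gives $\alpha = \beta/(\beta^2 + \epsilon)$. The Python sketch below is illustrative and not part of the original text. It assumes the form of (15.10) with right hand sides $\kappa(1 + \eta\gamma) - \alpha\beta^2 - \epsilon\alpha$, $\alpha\beta^2 - \beta + \epsilon\alpha$ and $\beta - \chi\gamma$, and the function names are hypothetical.

```python
def rhs(state, kappa, eta, eps, chi):
    """Right hand side of the cubic crosscatalator equations, as assumed above."""
    a, b, c = state
    return (kappa * (1.0 + eta * c) - a * b * b - eps * a,
            a * b * b - b + eps * a,
            b - chi * c)


def equilibrium(kappa, eta, eps, chi):
    """Equilibrium point (alpha, beta, gamma), valid when kappa*eta < chi."""
    beta = kappa / (1.0 - kappa * eta / chi)
    gamma = beta / chi
    alpha = beta / (beta * beta + eps)
    return alpha, beta, gamma


# Parameter values quoted in the text for Figure 15.4.
PARAMS = dict(kappa=0.71, eta=0.054, eps=0.005, chi=0.25)
point = equilibrium(**PARAMS)
residual = rhs(point, **PARAMS)
```

With the parameter values quoted in the text, `residual` vanishes to rounding error, which confirms that `point` is an equilibrium of the assumed equations.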


15.1.3 The Lorenz Equations 

Our third example is the system of three autonomous differential equations,

$$\begin{aligned}
\dot{x} &= -\tfrac{8}{3} x + yz, \\
\dot{y} &= -10(y - z), \\
\dot{z} &= -xy + 28y - z,
\end{aligned} \tag{15.11}$$

known as the Lorenz equations. Equations (15.11) were derived by Lorenz (1963) 
as the leading order approximation to the behaviour of an idealized model of the 
Earth’s atmosphere. To claim that these simple equations model the weather is 
perhaps going a little too far, but they certainly have very interesting dynamics. 

There are three equilibrium points, one at the origin and two at x = 27,
$y = z = \pm 6\sqrt{2}$, each of which is unstable. The two equilibrium points away
from the origin each have a two-dimensional unstable manifold, associated with 
complex eigenvalues, and hence oscillatory behaviour, and a one-dimensional sta- 
ble manifold. The system is rather like two copies of the cubic crosscatalator system 
interacting with each other. Typical solutions bounce back and forth between the 
two equilibrium points away from the origin, continually being attracted towards 
an equilibrium point along a trajectory close to the stable manifold, and then spi- 
ralling away close to the unstable manifold, as shown in Figure 15.5 for the solution 
with x = y = z = 1 when t = 0 . 
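
As a rough numerical illustration of these statements, the Python sketch below integrates (15.11) with a classical fourth order Runge-Kutta scheme. It is not part of the original text: the step size, the perturbation size and the function names are assumptions chosen for illustration. It checks that the right hand side vanishes at the two nontrivial equilibrium points, and measures how far apart two solutions move when one starts at x = y = z = 1 and the other is displaced slightly in x.

```python
import math


def lorenz(state):
    """Right hand side of the Lorenz equations in the form (15.11)."""
    x, y, z = state
    return (-8.0 / 3.0 * x + y * z, -10.0 * (y - z), -x * y + 28.0 * y - z)


def rk4_step(f, state, h):
    """One classical fourth order Runge-Kutta step of size h."""
    k1 = f(state)
    k2 = f(tuple(s + 0.5 * h * k for s, k in zip(state, k1)))
    k3 = f(tuple(s + 0.5 * h * k for s, k in zip(state, k2)))
    k4 = f(tuple(s + h * k for s, k in zip(state, k3)))
    return tuple(s + h * (a + 2.0 * b + 2.0 * c + d) / 6.0
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def compare(delta, h, n_steps, n_skip):
    """Integrate two nearby solutions from (1, 1, 1) and (1 + delta, 1, 1).

    Returns the largest separation seen after the first n_skip steps and
    the largest absolute coordinate value of the first solution.
    """
    p, q = (1.0, 1.0, 1.0), (1.0 + delta, 1.0, 1.0)
    max_sep, max_abs = 0.0, 0.0
    for step in range(n_steps):
        p, q = rk4_step(lorenz, p, h), rk4_step(lorenz, q, h)
        max_abs = max(max_abs, *(abs(v) for v in p))
        if step >= n_skip:
            max_sep = max(max_sep, math.dist(p, q))
    return max_sep, max_abs
```

Calling `compare(1e-8, 0.01, 4000, 3000)` follows both solutions up to t = 40. The first solution remains in a bounded region, while the separation between the two grows by many orders of magnitude from its initial value.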

The best way to get a feel for the dynamics of the Lorenz equations, and an 
indication of their iconic status as the first chaotic system to be discovered, is to 
type lorenz in MATLAB, which runs an animated simulation.

Fig. 15.4. The solution of the cubic crosscatalator equations, (15.10), with $\kappa = 0.71$,
$\eta = 0.054$, $\epsilon = 0.005$, $\chi = 0.25$ and $\alpha(0) = \beta(0) = \gamma(0) = 0$.



15.2 Mappings 

Although this book is about differential equations, we will see later that it is often 
helpful to relate the solutions of differential equations to those of mappings, some- 
times known as difference equations. Before we give some basic definitions, let’s 
consider three examples of nonlinear mappings, which illustrate how complicated 
the solutions of these deceptively simple systems can be. 


Example: A shift map 

The shift map is defined by 


$$x \mapsto ax \mid 1, \tag{15.12}$$

which maps [0, 1) to itself, where $b \mid c$ means the remainder when b is divided by
c. Let’s focus on the case a = 10. The shift map is then equivalent to shifting
the decimal point one place to the right and throwing away the integer part. For
example, 1/8 = 0.125 maps to 0.25, which maps to 0.5, which maps to zero, an
equilibrium, or fixed point of the map. Another equilibrium point of the map
is 1/9 = 0.11111... . The map also has many periodic solutions, for example,
1/11 = 0.09090909... ↦ 10/11 = 0.909090... ↦ 1/11 ↦ ..., which has period 2.

Fig. 15.5. The solution of the Lorenz equations, (15.11), with x(0) = y(0) = z(0) = 1.

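The rational examples given for the shift map with a = 10 can be reproduced exactly with rational arithmetic, which avoids the rounding that floating point numbers would introduce. The following Python sketch is illustrative only, and the function names are hypothetical.

```python
from fractions import Fraction


def shift(x, a=10):
    """One application of the shift map: multiply by a, discard the integer part."""
    return (a * x) % 1


def orbit(x, n, a=10):
    """The starting point followed by its first n images under the shift map."""
    points = [x]
    for _ in range(n):
        x = shift(x, a)
        points.append(x)
    return points


finite_decimal = orbit(Fraction(1, 8), 3)   # 1/8, 1/4, 1/2, 0
fixed_point = orbit(Fraction(1, 9), 2)      # 1/9 maps to itself
two_cycle = orbit(Fraction(1, 11), 4)       # alternates between 1/11 and 10/11
```

The three orbits show a point with a finite decimal expansion reaching zero, a fixed point, and a solution of period 2.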


Every rational number has either a finite decimal expansion, and therefore is
eventually mapped to zero, for example 1/8, or has a repeated decimal expansion,
in which case it is part of a periodic solution, for example 1/11. Irrational numbers,
for example $\sqrt{2}$, $\pi$ and e, have decimal expansions that do not repeat themselves
and have no pattern, for example $\sqrt{2} - 1 \approx 0.4142135\ldots \mapsto 0.1421356\ldots \mapsto
0.4213562\ldots \mapsto 0.2135623\ldots \mapsto \ldots$. In addition, two irrational numbers may be
arbitrarily close to each other, but the corresponding solutions eventually diverge. For
example, consider $x = x_1 = \sqrt{2} - 1$ and $x = x_2 = \sqrt{2} - 1 + 10^{-5}\pi \approx 0.4142449\ldots$.
These differ by just $10^{-5}\pi$, but after four iterations of (15.12), $x_1$ maps to 0.135...
whilst $x_2$ maps to 0.449... . This is a simple example of sensitive dependence on
initial conditions. Each solution initially at an irrational number therefore satisfies
our conditions to be called a chaotic solution, since they behave in an apparently
random manner, and initially close solutions diverge. The chaotic solutions, which
correspond to the irrational numbers, are dense in [0, 1), as are the periodic
solutions, which correspond to rational numbers with no finite decimal representation.

Example: The logistic map

Consider the logistic map 

$$x_{n+1} = r x_n (1 - x_n), \tag{15.13}$$

where r is a constant, with $0 < r \le 4$ and n an integer. The logistic map is a simple
model for the growth of the population of a single species. Starting from an initial
population, $x_0$, (15.13) gives a measure of the population in subsequent generations.
The state $x_n = 0$ represents the complete absence of the species, and for $x_n \ll 1$,
$x_{n+1} \sim r x_n$, so that the next generation grows by a factor of r. When $x_n$ is not
small, the factor $(1 - x_n)$, which models the effect of overcrowding and competition
for resources, is no longer close to unity, and the full, nonlinear equation, (15.13),
determines the size of the next generation. Note that if $0 \le x_0 \le 1$, then $0 \le x_n \le 1$
for $n \ge 0$. The interval [0, 1] is also the physically meaningful range for this map.

Let’s begin by trying to find the fixed points of (15.13). These satisfy $x_{n+1} = x_n$,
and hence $x_n = r x_n (1 - x_n)$. The fixed points are therefore x = 0 and x = (r − 1)/r.
The nontrivial fixed point lies in the meaningful range only if r > 1. Let’s now
determine whether there are any solutions of period 2, or 2-cycles. These satisfy

$$x_{n+1} = r x_n (1 - x_n),$$

$$x_{n+2} = x_n = r x_{n+1} (1 - x_{n+1}).$$

By eliminating $x_{n+1}$, we obtain the equation for $x_n$,

$$x_n \left( x_n - \frac{r - 1}{r} \right) \left\{ r^2 x_n^2 - r(1 + r) x_n + (1 + r) \right\} = 0.$$

This equation is easy to factorize, since we know that it must also be satisfied by
the two fixed points. The discriminant of the quadratic factor is $r^2(r^2 - 2r - 3)$,
which is positive provided r > 3. This means there are points of period 2 for r > 3.
In fact, it can be shown that the nontrivial fixed point is stable for r < 3, but loses
stability in a bifurcation† at r = 3, where the points of period 2 emerge and are
stable. Similarly, as r increases, the points of period 2 eventually lose stability, and 
a stable 4-cycle emerges. This process is known as period doubling, and, as r 
increases, eventually leads to chaotic solutions. We will not go into the details of 
this process, since we want to concentrate on maps relevant to differential equations. 
The interested reader is referred to Arrowsmith and Place (1990). Figure 15.6 shows 
the period doubling process as a bifurcation diagram. This was produced using the 
MATLAB script 


    for r = 0:0.005:4
        x = rand(1);
        % discard the first 100 iterates as a transient
        for j = 1:100
            x = r*x*(1-x);
        end
        % store the next 400 iterates
        xout = [];
        for j = 1:400
            x = r*x*(1-x); xout = [xout x];
        end
        plot(r*ones(size(xout)),xout,'.','MarkerSize',3)
        axis([0 4 0 1]), hold on, pause(0.01)
    end

† A flip bifurcation.



This iterates the map 100 times, starting from randomly generated initial conditions 
(x = rand(l)), and then saves the next 400 iterates of the map before plotting 
them. Figure 15.7 shows 100 iterates of the logistic map with r = 4 for two initial
conditions separated by just $10^{-16}$. The apparently random nature of the solution
can be seen, as can the fact that these initially very close solutions have completely 
diverged after about 50 iterations of the map. 
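
Both the 2-cycle calculation and the divergence of nearby solutions can be checked in a few lines. The Python sketch below is an illustration, not part of the original text. It assumes the quadratic factor $r^2 x^2 - r(1 + r)x + (1 + r) = 0$ with discriminant $r^2(r^2 - 2r - 3)$ from the discussion of 2-cycles, and the function names are hypothetical.

```python
import math


def logistic(x, r):
    """One iteration of the logistic map."""
    return r * x * (1.0 - x)


def two_cycle(r):
    """Roots of r^2 x^2 - r(1 + r)x + (1 + r) = 0, or None if they are not real.

    For r > 3 the two roots are the points of period 2.
    """
    disc = r * r * (r * r - 2.0 * r - 3.0)
    if disc < 0.0:
        return None
    s = math.sqrt(disc)
    return (r * (1.0 + r) - s) / (2.0 * r * r), (r * (1.0 + r) + s) / (2.0 * r * r)


def max_gap(x0, delta, r, n):
    """Largest difference between two solutions started delta apart, over n iterates."""
    x, y, gap = x0, x0 + delta, 0.0
    for _ in range(n):
        x, y = logistic(x, r), logistic(y, r)
        gap = max(gap, abs(x - y))
    return gap
```

For r = 3.5 the two roots are 3/7 and 6/7, and each maps to the other. With r = 4, `max_gap(0.1, 1e-16, 4.0, 100)` shows that two solutions started $10^{-16}$ apart separate by an order one amount within 100 iterations, in line with Figure 15.7.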



Fig. 15.6. The bifurcation diagram for the logistic map.

Fig. 15.7. A sequence of 100 iterates of the logistic map with r = 4 and $x_0 = 0.1$ and
$x_0 = 0.1 + 10^{-16}$.


We will also discuss another phenomenon that can occur in the logistic map, 
known as intermittency. A system that behaves in an intermittent manner 
exhibits bursts of chaotic behaviour interspersed with simpler, more regular be- 
haviour. Many other systems, both maps and flows, can exhibit intermittency, 
which often begins to occur as a parameter is changed, and is a prelude to full 
chaos. For example, the flow of water down a pipe is smooth and steady, or laminar,
at sufficiently low flow rates.† At sufficiently high flow rates, the flow becomes
unsteady and chaotic, or turbulent. However, at intermediate flow rates, tur- 
bulence can appear in intermittent, spatially localized bursts (see, for example, 
Mathieu and Scott, 2000). 

Consider the fifth iterate of the logistic map, $f^{(5)}(x)$, which is shown in
Figure 15.8 for $r = 3.7$. Note that there are four points where the curve is close
to the straight line that represents the identity mapping, $f(x) = x$, but does not
touch it. We will focus on the point close to $x = 0.65$, the image of which is
shown in Figure 15.8, and look for the nearby values of $x$ and $r$ at which $f^{(5)}(x)$
actually touches the straight line. This is then a fixed point of $f^{(5)}(x)$, and hence
part of a periodic solution of the logistic map of period 5. It is straightforward


† Strictly speaking, at sufficiently low Reynolds numbers.






Fig. 15.8. The fifth iterate of the logistic map for r = 3.7. Also shown is the image of the 
point x = 0.65. 


to show, using MATLAB, that $g(x) = f^{(5)}(x) - x$ and its derivative are zero when
$r = r_c \approx 3.73817237526634$ and $x = x_c \approx 0.66045050397608$. Figure 15.9 shows
$f^{(5)}(x)$ in the neighbourhood of this point when $r = r_c$ and for two values of $r$
slightly above and below $r_c$. We can see that for $r > r_c$ there are two fixed points,
one stable, one unstable, whilst for $r < r_c$, locally there are no fixed points. Clearly,
$r = r_c$ is a bifurcation point of the map $f^{(5)}$, analogous to the saddle-node
bifurcation that we discussed in Section 13.3.1, and is known as a tangent bifurcation.
For values of $r$ slightly greater than $r_c$, points initially close to $x_c$ are attracted
to the fixed point, and therefore the solution of the logistic map is attracted to
a stable periodic solution of period 5. For $r$ slightly less than $r_c$, we can see in
Figure 15.9 that $f^{(5)}(x)$ is very close to the straight line, and that iterates of the
map can be trapped close to $x = x_c$ before moving away, and also close to the
other points where the curve is close to the straight line. The solution therefore
exhibits almost steady behaviour before moving away and behaving irregularly, as
shown in Figure 15.10. This is intermittency. The equivalent solution of the logistic
map displays almost periodic behaviour interrupted by chaotic bursts, which is also
shown in Figure 15.10. For further examples of intermittency, see Guckenheimer
and Holmes (1983).
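One way to locate the tangent bifurcation numerically is to solve the two conditions $f^{(5)}(x) - x = 0$ and $\mathrm{d}f^{(5)}/\mathrm{d}x - 1 = 0$ simultaneously for $(x, r)$. The Python sketch below does this with Newton's method and a finite-difference Jacobian. It is an illustration only, not the MATLAB calculation referred to in the text; the function names, the step size `h` and the starting guess $(0.66, 3.738)$ are our own assumptions.

```python
def f5(x, r):
    """Fifth iterate of the logistic map x -> r*x*(1-x)."""
    for _ in range(5):
        x = r * x * (1.0 - x)
    return x

def df5(x, r):
    """Derivative of the fifth iterate with respect to x (chain rule)."""
    slope = 1.0
    for _ in range(5):
        slope *= r * (1.0 - 2.0 * x)
        x = r * x * (1.0 - x)
    return slope

def residual(x, r):
    """The two quantities that vanish at a tangent bifurcation."""
    return f5(x, r) - x, df5(x, r) - 1.0

def tangent_point(x, r, steps=20, h=1e-7):
    """Newton iteration for residual(x, r) = (0, 0), forward-difference Jacobian."""
    for _ in range(steps):
        g, dg = residual(x, r)
        gx, dgx = [(p - q) / h for p, q in zip(residual(x + h, r), (g, dg))]
        gr, dgr = [(p - q) / h for p, q in zip(residual(x, r + h), (g, dg))]
        det = gx * dgr - gr * dgx
        # Solve the 2x2 linear system for the Newton update by Cramer's rule.
        x -= (g * dgr - gr * dg) / det
        r -= (gx * dg - g * dgx) / det
    return x, r

x_c, r_c = tangent_point(0.66, 3.738)
print("x_c =", x_c, " r_c =", r_c)
```

Starting this close to the point of interest, the iteration should settle on values that agree with those quoted in the text to many digits.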
Fig. 15.10. The action of the logistic map and its fifth iterate with $x_1 = x_c$ and
$r = r_c - 1 \times 10^{-6}$.

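Example: The Hénon map

The Hénon map is a nonlinear, two-dimensional map, which is defined by

$$x_{n+1} = x_n \cos\theta - y_n \sin\theta + x_n^2 \sin\theta,$$

$$y_{n+1} = x_n \sin\theta + y_n \cos\theta - x_n^2 \cos\theta.$$

This is the simplest area-preserving, quadratically nonlinear map whose linear
part is a rotation through the angle $\theta$ about the origin. We can see that it preserves
areas, since the determinant of its Jacobian, the definition of which we will discuss
in more detail below, is

$$\begin{vmatrix} \dfrac{\partial x_{n+1}}{\partial x_n} & \dfrac{\partial x_{n+1}}{\partial y_n} \\[2mm] \dfrac{\partial y_{n+1}}{\partial x_n} & \dfrac{\partial y_{n+1}}{\partial y_n} \end{vmatrix} = \begin{vmatrix} \cos\theta + 2x_n\sin\theta & -\sin\theta \\ \sin\theta - 2x_n\cos\theta & \cos\theta \end{vmatrix} = 1.$$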
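The area-preserving property is easy to confirm numerically. The following Python sketch (an illustration in Python rather than MATLAB; the function names and the step size `h` are our own) estimates the Jacobian determinant of the Hénon map by central differences at a few random points. Because the map is quadratic, central differences are exact up to rounding error.

```python
import math
import random

def henon(x, y, cos_t):
    """One application of the Henon map with rotation angle theta."""
    sin_t = math.sqrt(1.0 - cos_t ** 2)
    return (x * cos_t - y * sin_t + x * x * sin_t,
            x * sin_t + y * cos_t - x * x * cos_t)

def jacobian_det(x, y, cos_t, h=1e-6):
    """Central-difference estimate of the determinant of the Jacobian."""
    xp, yp = henon(x + h, y, cos_t)
    xm, ym = henon(x - h, y, cos_t)
    a, c = (xp - xm) / (2 * h), (yp - ym) / (2 * h)   # derivatives w.r.t. x
    xp, yp = henon(x, y + h, cos_t)
    xm, ym = henon(x, y - h, cos_t)
    b, d = (xp - xm) / (2 * h), (yp - ym) / (2 * h)   # derivatives w.r.t. y
    return a * d - b * c

random.seed(0)
dets = [jacobian_det(random.uniform(-1, 1), random.uniform(-1, 1), 0.24)
        for _ in range(5)]
print(dets)  # each entry should be very close to 1
```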



In Figure 15.11 we show the results of an iteration using the MATLAB script 
cosa=0.24; sina=sqrt(1-cosa^2);
ii=0;
for st=-0.5:0.05:0.5                  % initial points with x = y = st
    x=st; y=st;
    for its=1:1000
        ii=ii+1;
        xn=cosa*x-sina*y+x^2*sina;
        yn=sina*x+cosa*y-x^2*cosa;
        x=xn; y=yn; po(ii)=x+i*y;     % store the iterate as a complex number
        if ((abs(x)>10) | (abs(y)>10))
            break                     % stop once the orbit leaves the domain
        end
    end
end
plot(po,'.','MarkerSize',2)
axis equal, axis([-1 1 -1 1])


Note that we have used MATLAB’s ability to plot complex numbers, and that |
is the logical or operator. In addition, it is necessary to confine the iterates by
stopping the calculation once the points have left the domain of interest.


Fig. 15.11. Iterates of the Hénon map for $\cos\theta = 0.24$ (left) and $\cos\theta = 0.3$ (right).


It is clear that this apparently simple map gives rise to extremely complex
behaviour. The regions where there are concentric sets of points, or islands, owe their
appearance to the existence of periodic solutions. In particular, for $\cos\theta = 0.24$
we see evidence of a period 5 structure. In fact, there is a 5-cycle consisting of the
five points at the centre of the chain of five islands, separated by five further
unstable equilibrium points. Chaotic solutions exist near these points. If we consider
other values of $\cos\theta$, other periodic structures become evident (for instance, when
$\cos\theta = 0.34$ we find a structure of period 11). We also note that if we consider
certain regions in more detail, we can see structures with even longer periods (see,
for example, Figure 15.12, which we obtained from Figure 15.11 using MATLAB’s
ability to zoom in on a small region of an existing figure).



Fig. 15.12. Iterates of the Hénon map for $\cos\theta = 0.24$ in the neighbourhood of one of the
hyperbolic fixed points.


15.2.1 Fixed and Periodic Points of Maps 

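Consider the map

$$\mathbf{x}_{n+1} = \mathbf{g}(\mathbf{x}_n), \tag{15.14}$$

where $\mathbf{g} : \mathbb{R}^n \to \mathbb{R}^n$ is a diffeomorphism; that is, $\mathbf{g}$ is a bijection and is
differentiable with a differentiable inverse†. A fixed point, $\bar{\mathbf{x}}$, of the map satisfies
$\bar{\mathbf{x}} = \mathbf{g}(\bar{\mathbf{x}})$. A point of period $k$, $\mathbf{x}^*$, satisfies $\mathbf{x}^* = \mathbf{g}^k(\mathbf{x}^*)$, and $\mathbf{x}^* \neq \mathbf{g}^m(\mathbf{x}^*)$ for
all positive integers $m < k$.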


† Note that, of the maps that we consider in this section, only the Hénon map is a diffeomorphism,
since the logistic, tent and horseshoe maps are not bijective and have discontinuous derivatives.
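The definition of a point of period $k$ translates directly into a test: apply the map repeatedly and record the first $k$ at which the point returns to itself. The Python sketch below (a scalar illustration; the name `prime_period`, the tolerance and the choice of the logistic map with $r = 3.2$ are our own assumptions) applies this test to the fixed point and to a point on the 2-cycle of the logistic map.

```python
import math

def prime_period(g, x, kmax=50, tol=1e-9):
    """Smallest k <= kmax with |g^k(x) - x| < tol, or None if there is none."""
    y = x
    for k in range(1, kmax + 1):
        y = g(y)
        if abs(y - x) < tol:
            return k
    return None

r = 3.2

def g(x):
    """Logistic map with r = 3.2, used here purely as a test case."""
    return r * x * (1.0 - x)

# Nonzero fixed point of the logistic map.
fixed = 1.0 - 1.0 / r
# One point of the 2-cycle, from the quadratic satisfied by period-2 points.
two_cycle = ((r + 1.0) + math.sqrt((r - 3.0) * (r + 1.0))) / (2.0 * r)

print(prime_period(g, fixed), prime_period(g, two_cycle))
```

For a map on $\mathbb{R}^n$ the absolute value would be replaced by a vector norm.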
Let’s now consider the linearization of (15.14) about a fixed point, $\bar{\mathbf{x}}$. If we let
$\mathbf{x}_n = \bar{\mathbf{x}} + \hat{\mathbf{x}}_n$, we find that, at leading order,

$$\hat{\mathbf{x}}_{n+1} = D\mathbf{g}(\bar{\mathbf{x}})\,\hat{\mathbf{x}}_n, \tag{15.15}$$

where $D\mathbf{g}(\bar{\mathbf{x}})$ is the Jacobian matrix associated with $\mathbf{g}$ at $\mathbf{x} = \bar{\mathbf{x}}$. For one-dimensional
maps, the solution of $\hat{x}_{n+1} = g'(0)\hat{x}_n$ is $\hat{x}_n = (g'(0))^n \hat{x}_0$, and hence the stability
of $x = 0$ depends upon whether $|g'(0)|$ is greater than or less than unity. Similarly,
(15.15) has solution

$$\hat{\mathbf{x}}_n = \left(D\mathbf{g}(\bar{\mathbf{x}})\right)^n \hat{\mathbf{x}}_0, \tag{15.16}$$

and the stability of the fixed point depends upon the size of the moduli of the $n$
eigenvalues of $D\mathbf{g}(\bar{\mathbf{x}})$. We say that the equilibrium point is hyperbolic if none of the
eigenvalues of its Jacobian have modulus unity, and that it is nonhyperbolic otherwise.
In the same way as we found for differential systems, in the neighbourhood of an
equilibrium point of a map, there exist three invariant manifolds.

(i) The local stable manifold, $\omega^s_{\rm loc}$, of dimension $s$, is spanned by the
eigenvectors of $D\mathbf{g}(\bar{\mathbf{x}})$ whose eigenvalues have modulus less than unity.

(ii) The local unstable manifold, $\omega^u_{\rm loc}$, of dimension $u$, is spanned by the
eigenvectors of $D\mathbf{g}(\bar{\mathbf{x}})$ whose eigenvalues have modulus greater than unity.

(iii) The local centre manifold, $\omega^c_{\rm loc}$, of dimension $c$, is spanned by the
eigenvectors of $D\mathbf{g}(\bar{\mathbf{x}})$ whose eigenvalues have modulus equal to unity.

These local manifolds exist as global manifolds when we consider the full, nonlinear
system, just as they do for systems of differential equations.
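For a two-dimensional map, the classification above only requires the moduli of the two eigenvalues of the Jacobian, which follow from its trace and determinant. The Python sketch below is a minimal illustration; the function names, the labels returned and the tolerance used to decide whether a modulus equals unity are our own assumptions.

```python
import cmath
import math

def eigenvalues_2x2(a, b, c, d):
    """Eigenvalues of the matrix [[a, b], [c, d]] from its trace and determinant."""
    tr, det = a + d, a * d - b * c
    disc = cmath.sqrt(tr * tr - 4.0 * det)   # complex square root
    return (tr + disc) / 2.0, (tr - disc) / 2.0

def classify(a, b, c, d, tol=1e-12):
    """Classify a fixed point of a planar map from the Jacobian entries."""
    moduli = [abs(lam) for lam in eigenvalues_2x2(a, b, c, d)]
    if any(abs(m - 1.0) <= tol for m in moduli):
        return "nonhyperbolic"
    if all(m < 1.0 for m in moduli):
        return "hyperbolic, stable"
    if all(m > 1.0 for m in moduli):
        return "hyperbolic, unstable"
    return "hyperbolic, saddle"

# Jacobian of the Henon map at the origin: a pure rotation with cos(theta) = 0.24.
cos_t = 0.24
sin_t = math.sqrt(1.0 - cos_t ** 2)
print(classify(cos_t, -sin_t, sin_t, cos_t))
print(classify(0.5, 0.0, 0.0, 2.0))
```

The first case is a rotation, whose eigenvalues $e^{\pm i\theta}$ lie on the unit circle, so the origin of the Hénon map is nonhyperbolic. The second has one eigenvalue inside and one outside the unit circle.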

Theorem 15.1 Consider the diffeomorphism (15.14). 

(i) The stable manifolds of different equilibrium points cannot intersect. 

(ii) The unstable manifolds of different equilibrium points cannot intersect.

(iii) The stable manifold of an equilibrium point cannot intersect itself. 

(iv) The unstable manifold of an equilibrium point cannot intersect itself. 

Proof Recall that every point $\mathbf{x} \in \mathbb{R}^n$ has a unique image and preimage, since $\mathbf{g}$ is
a diffeomorphism.

(i) Since a point in the stable manifold of an equilibrium point, $\bar{\mathbf{x}}$, must
asymptote to $\bar{\mathbf{x}}$ as $n \to \infty$, it cannot also asymptote to a different equilibrium
point as $n \to \infty$, which proves (i).

(ii) As for case (i) but with $n \to -\infty$.

(iii) Consider a point, x*, that is mapped to a point of intersection of the stable 
manifold with itself, g(x*). Points on the stable manifold in the neighbour- 
hood of x* must map to points on the stable manifold in the neighbourhood 
of g(x*). Since g is continuous, this is not possible. 

(iv) As for case (iii), but on the unstable manifold. 


□ 
Although this theorem eliminates several possibilities, it is possible for a stable
manifold to intersect an unstable manifold, either of the same equilibrium point or 
a different one. This will prove to be crucial later. 


15.2.2 Tents and Horseshoes 

Before we return to consider systems of differential equations, we will examine 
two more examples of maps, which will prove to be of direct relevance later. 

Example: A tent map 

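A tent map is a map $x \mapsto f(x)$ where $f : \mathbb{R} \to \mathbb{R}$ and

$$f(x) = \begin{cases} sx & \text{for } x \leq \tfrac{1}{2}, \\ s(1-x) & \text{for } x \geq \tfrac{1}{2}. \end{cases}$$

We will concentrate on the case $s = 3$, when the function $f$ is as shown in
Figure 15.13. Note that $f(x) = 1$ when $x = \tfrac{1}{3}$ or $x = \tfrac{2}{3}$, and that the equilibrium
points of the map are $x = 0$ and $x = \tfrac{3}{4}$ (since $f(\tfrac{3}{4}) = 3(1 - \tfrac{3}{4}) = \tfrac{3}{4}$). Let’s now
consider where this tent map sends various sets of points.

Fig. 15.13. The function $f(x)$ when $s = 3$.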

(i) If $x < 0$, $x \mapsto 3x < x$. Clearly, $f^n(x) \to -\infty$ as $n \to \infty$ for $x \in (-\infty, 0)$.

(ii) If $x > 1$, $x \mapsto 3(1 - x) < 0$. After one iteration, all points with $x > 1$ are
therefore mapped to a point with $x < 0$, and case (i) applies for subsequent
iterations, with $f^n(x) \to -\infty$ as $n \to \infty$ for $x \in (1, \infty)$.

(iii) If $x \in (\tfrac{1}{3}, \tfrac{2}{3})$, $x \mapsto f(x) > 1$. After one iteration, all points with $\tfrac{1}{3} < x < \tfrac{2}{3}$
are therefore mapped to a point with $x > 1$, and case (ii) applies for
subsequent iterations, with $f^n(x) \to -\infty$ as $n \to \infty$.

(iv) Now consider points with $x \in [0, \tfrac{1}{3}] \cup [\tfrac{2}{3}, 1]$, which are mapped to $[0, 1]$.
Only points in the set $[0, \tfrac{1}{9}] \cup [\tfrac{2}{9}, \tfrac{1}{3}] \cup [\tfrac{2}{3}, \tfrac{7}{9}] \cup [\tfrac{8}{9}, 1]$ have images in the set
$[0, \tfrac{1}{3}] \cup [\tfrac{2}{3}, 1]$, so the remaining points have $f^n(x) \to -\infty$ as $n \to \infty$.
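The four cases above can be explored with exact rational arithmetic, which avoids any rounding error in the repeated multiplication by 3. The Python sketch below is a minimal illustration (the names `tent` and `escape_time`, and the cut-off `nmax`, are our own): it counts how many iterations of the tent map with $s = 3$ a point survives before it leaves $[0, 1]$.

```python
from fractions import Fraction

def tent(x, s=3):
    """Tent map: s*x for x <= 1/2, s*(1 - x) for x >= 1/2."""
    return s * x if x <= Fraction(1, 2) else s * (1 - x)

def escape_time(x, nmax=50):
    """Number of iterations before the orbit leaves [0, 1].

    Returns None if the orbit is still in [0, 1] after nmax iterations.
    """
    for n in range(nmax + 1):
        if x < 0 or x > 1:
            return n
        x = tent(x)
    return None

for x in (Fraction(1, 2), Fraction(1, 5), Fraction(1, 4), Fraction(1, 3)):
    print(x, escape_time(x))
```

The point $\tfrac{1}{2}$ lies in the middle third and leaves after one iteration, $\tfrac{1}{5}$ lies in $(\tfrac{1}{9}, \tfrac{2}{9})$ and leaves after two, whilst $\tfrac{1}{4}$ and $\tfrac{1}{3}$ are never expelled.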

If we continue this process, we can see that the set of points, $K$, that are not expelled
from $[0, 1]$ as $n \to \infty$ can be constructed in an iterative way by deleting the middle
third from $K_0 = [0, 1]$ to leave $K_1 = [0, \tfrac{1}{3}] \cup [\tfrac{2}{3}, 1]$, then deleting the middle third
from each of the remaining intervals to leave $K_2 = [0, \tfrac{1}{9}] \cup [\tfrac{2}{9}, \tfrac{1}{3}] \cup [\tfrac{2}{3}, \tfrac{7}{9}] \cup [\tfrac{8}{9}, 1]$,
and so on, with $K_n$ the union of the $2^n$ closed subintervals $[r/3^n, (r + 1)/3^n]$, each
of which has length $3^{-n}$. The index $r$ takes the values in the set $R_n$, which can be
generated iteratively using

$$R_0 = \{0\}, \quad R_n = \{R_{n-1},\; 3^n - 1 - R_{n-1}\} \quad \text{for } n \geq 1.$$

We then have

$$K = \bigcap_{n=0}^{\infty} K_n,$$

which is known as Cantor’s middle-third set and was first constructed by Cantor
in 1883 as an example of an infinite, completely disconnected set. All points
initially in $K$ remain in $K$ as $n \to \infty$, so that $K$ is a positively invariant set for the
tent map with $s = 3$. Note that the length of $K_n$ is $2^n/3^n$, which tends to zero as
$n \to \infty$, and hence the length of $K$ is zero.
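The recursion for $R_n$ is straightforward to implement, and doing so gives a quick check on the number and total length of the subintervals in $K_n$. The following Python sketch is an illustration only; the function names are our own, and exact fractions are used so that the lengths can be compared with $(2/3)^n$ without rounding.

```python
from fractions import Fraction

def index_set(n):
    """R_n from R_0 = {0}, R_n = R_{n-1} together with 3^n - 1 - R_{n-1}."""
    R = [0]
    for m in range(1, n + 1):
        R = R + [3 ** m - 1 - r for r in R]
    return sorted(R)

def cantor_intervals(n):
    """The 2^n closed subintervals [r/3^n, (r+1)/3^n] that make up K_n."""
    return [(Fraction(r, 3 ** n), Fraction(r + 1, 3 ** n)) for r in index_set(n)]

def total_length(n):
    """Total length of K_n."""
    return sum(b - a for a, b in cantor_intervals(n))

print(index_set(2))
print(cantor_intervals(2))
print(total_length(5), Fraction(2, 3) ** 5)
```

The same index sets are obtained by replacing each $r \in R_{n-1}$ by $3r$ and $3r + 2$, which is the "delete the middle third" rule applied to each subinterval in turn.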

A more convenient definition of $K$ is provided by looking at the numbers in
$[0, 1]$ expressed in ternary, or base three, form. We can write all points in $K$ as
$x = 0.d_1d_2d_3\ldots = d_1/3 + d_2/3^2 + d_3/3^3 + \cdots$, with $d_n = 0$ or $2$ for $n = 1, 2, \ldots$. To
see this, note that each point in $K_1$ has ternary form $0.0x_2x_3\ldots$ if it lies in
$[0, \tfrac{1}{3}] = [0, 0.0222\ldots]$ and $0.2x_2x_3\ldots$ if it lies in $[\tfrac{2}{3}, 1] = [0.2, 0.222\ldots]$. Similarly, each
$x$ in $K_2$ has ternary form $0.00x_3x_4\ldots$, $0.02x_3x_4\ldots$, $0.20x_3x_4\ldots$ or $0.22x_3x_4\ldots$
according to which of the four subintervals of $K_2$ it lies in, and so on. We can
see that excluding the middle third of each successive subinterval is equivalent to
excluding numbers that have $d_i = 1$ for $i = 1, 2, \ldots$.

Lemma 15.1 Cantor’s middle-third set, K, has uncountably many points. 

Proof Suppose that $K$ is countable, so that all the members of $K$ can be ordered as
$x_1 = 0.x_{11}x_{12}x_{13}\ldots$, $x_2 = 0.x_{21}x_{22}x_{23}\ldots$, and so on, in ternary form. We can now define
$y = 0.y_1y_2y_3\ldots$, with $y_m = 0$ if $x_{mm} = 2$ and $y_m = 2$ if $x_{mm} = 0$. Then $0 \leq y \leq 1$,
$y \in K$ and $y \neq x_n$ for all $n$, which is a contradiction. The elements of $K$ cannot therefore
be ordered, and must be uncountable.† □

† This is just a variation of Cantor’s famous diagonalization proof that the real numbers are
uncountable.

Nonetheless, by construction, $K$ contains no subinterval, however small, of $[0, 1]$,
and $[0, 1]$ contains infinitely many subintervals that do not intersect $K$.

Other Cantor sets can be constructed by successively removing different parts 
of [0, 1]. An invariant set that is also a Cantor set is, as we shall see, typical of a 
chaotic system. 


Example: The Smale horseshoe map 
Consider a two-dimensional map from the unit square,

$$D = \{(x, y) \in \mathbb{R}^2 \mid 0 \leq x \leq 1,\ 0 \leq y \leq 1\},$$

to itself, $f : D \to D$. The function $f$ contracts the square in the horizontal direction
and expands it in the vertical direction, and then folds the resulting strip back on
itself, as shown in Figure 15.14. The map is only defined on the unit square, $D$,
and points that are mapped out of $D$ are discarded. This is the Smale horseshoe
map. The inverse map, which can be visualized in terms of stretching and folding
the unit square in the opposite way, is shown in Figure 15.15.


Fig. 15.14. The Smale horseshoe map.

The invariant set, $\Lambda$, which is both positively and negatively invariant, is the
intersection of the image of $D$ under any number of forward or backward iterations,

$$\Lambda = \cdots \cap f^{-2}(D) \cap f^{-1}(D) \cap D \cap f(D) \cap f^{2}(D) \cap \cdots.$$

As we can see, each application of the map removes the middle third of each
remaining strip in each direction, so we conclude that $\Lambda$ is a two-dimensional Cantor
set given by the set of points $(x, y)$ such that $x \in K_x$ and $y \in K_y$, where $K_x$ and
$K_y$ are the Cantor middle-third sets in each direction.

It is now helpful to show that each point in the invariant set can be uniquely
labelled by a sequence of 0’s and 1’s. After each forward iteration of the map, we
append $a_{-k}$ to the right of the symbol sequence, where

$$a_{-k} = \begin{cases} 0 & \text{for points in the left half of } D, \\ 1 & \text{for points in the right half of } D, \end{cases}$$
Fig. 15.15. The inverse Smale horseshoe map.

Fig. 15.16. The construction of the left half of the bi-infinite sequence.
as illustrated in Figure 15.16. After each backward iteration of the map we append
$a_k$ to the left of the symbol sequence, where

$$a_k = \begin{cases} 0 & \text{for points in the bottom half of } D, \\ 1 & \text{for points in the top half of } D, \end{cases}$$

as shown in Figure 15.17. We can then represent any point in $\Lambda$ as the bi-infinite
sequence of symbols,

$$a = \ldots a_2 a_1 a_0 . a_{-1} a_{-2} \ldots,$$

as shown in Figure 15.18. The digits to the right of the decimal point reflect the
vertical location and to the left the horizontal location. For any point in $\Lambda$, the
action of the horseshoe map is simply to shift the decimal point in the bi-infinite
sequence one place to the right. The map is therefore equivalent to the shift map
$x \mapsto 2x \bmod 1$, which is known as the Bernoulli shift map. For the same reasons that
the shift map that we studied as our first example had chaotic solutions, so does
the Bernoulli shift map, and hence the horseshoe map.

Fig. 15.17. The construction of the right half of the bi-infinite sequence.
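The statement that the map $x \mapsto 2x \bmod 1$ simply shifts the point in a symbol sequence can be seen by writing $x$ in binary. The Python sketch below (our own illustration, restricted to the one-sided binary expansion of a number in $[0, 1)$; the function names are assumptions) uses exact fractions to confirm that one application of the map discards the leading binary digit.

```python
from fractions import Fraction

def binary_digits(x, n):
    """First n binary digits of a number x in [0, 1)."""
    digits = []
    for _ in range(n):
        x *= 2
        d = int(x)        # the next digit, 0 or 1
        digits.append(d)
        x -= d
    return digits

def shift(x):
    """The Bernoulli shift map x -> 2x mod 1."""
    return (2 * x) % 1

x = Fraction(5, 13)
print(binary_digits(x, 12))
print(binary_digits(shift(x), 11))   # the same digits with the first removed

# In double precision the same map loses one binary digit per iteration,
# so the computed orbit of 0.1 reaches exactly zero after a few dozen steps.
y = 0.1
for _ in range(60):
    y = (2 * y) % 1.0
print(y)
```

The last few lines are a reminder that floating-point iteration of this particular map is unreliable, since a double-precision number carries only a finite string of binary digits; the exact orbit of $\tfrac{1}{10}$ is periodic and never reaches zero.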


15.3 The Poincaré Return Map

Now that we have seen several examples of maps, we will demonstrate how the
solutions of a system of ordinary differential equations can be related to the solutions
of a map, the Poincaré map, that is easier to work with than the original system,
and then develop a technique for deciding whether this map is chaotic.

Consider the autonomous system of ordinary differential equations

$$\frac{d\mathbf{x}}{dt} = \mathbf{f}(\mathbf{x}), \tag{15.17}$$

where $\mathbf{f} : \mathbb{R}^n \to \mathbb{R}^n$ and $\mathbf{x} \in \mathbb{R}^n$. We introduce the flow evolution operator or
flow operator, $\phi^{t_1,t_0}$, which maps the point $\mathbf{x}_0 \in \mathbb{R}^n$ and an interval $(t_0, t_1) \subset \mathbb{R}$
to the point $\mathbf{x}_1 \in \mathbb{R}^n$, where $\mathbf{x}_1 = \mathbf{x}(t_1)$ and $\mathbf{x}$ is the solution of (15.17) subject
to $\mathbf{x} = \mathbf{x}_0$ when $t = t_0$. That is, $\mathbf{x}(t_0) = \mathbf{x}_0$ evolves to $\mathbf{x}(t_1) = \mathbf{x}_1$ in the
$n$-dimensional phase space during the time interval $(t_0, t_1)$. Note that $\phi^{t_0,t_0}(\mathbf{x}_0) = \mathbf{x}_0$
and $\phi^{t_2,t_1} \cdot \phi^{t_1,t_0} = \phi^{t_2,t_0}$ for all $t_1 \in (t_0, t_2)$. In addition, since (15.17) is an
autonomous system, we can express the flow evolution operator in terms of the
length of the time interval alone, and we write

$$\phi^{t_1,t_0}(\mathbf{x}_0) = \phi^{t_1 - t_0}(\mathbf{x}_0).$$

Consequently we have that $\phi^0$ is the identity transformation, $\phi^t \cdot \phi^s = \phi^{t+s}$ and
$(\phi^t)^{-1} = \phi^{-t}$, so that $\phi^t$ is a group operator (see Chapter 10).

Fig. 15.18. The points in $\Lambda$, the invariant set.


Example: The logistic equation 
Let’s construct the flow operator for the logistic equation,

$$\frac{dx}{dt} = x(1 - x).$$

This is the continuous version of the logistic map, which we studied earlier. The
logistic equation is separable, and we can therefore integrate it subject to the initial
condition that $x = x_0$ when $t = 0$ to obtain the solution, and hence the evolution
operator,

$$x(t) = \phi^t(x_0) = \frac{x_0 e^t}{x_0 e^t - x_0 + 1}.$$

Note that solutions of the logistic equation with $x_0 > 0$ all have $x \to 1$ as $t \to \infty$, in complete
contrast to the solutions of the logistic map. This shows that choosing to use a
continuous rather than a discrete system can have dramatic consequences for the
behaviour of the solutions.

We can now verify that $\phi^0(x_0) = x_0$ and

$$\phi^t \cdot \phi^s(x_0) = \phi^t(\phi^s(x_0)) = \frac{\phi^s(x_0) e^t}{\phi^s(x_0) e^t - \phi^s(x_0) + 1},$$

and, since $\phi^s(x_0) = x_0 e^s/(x_0 e^s - x_0 + 1)$,

$$\phi^t \cdot \phi^s(x_0) = \frac{x_0 e^s e^t}{x_0 e^s e^t - x_0 e^s + (x_0 e^s - x_0 + 1)} = \frac{x_0 e^{s+t}}{x_0 e^{s+t} - x_0 + 1},$$

which is equal to $\phi^{s+t}(x_0)$, as expected.
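The group properties of the flow operator for the logistic equation can also be checked numerically. The Python sketch below is a minimal illustration, with `flow(t, x0)` our own name for $\phi^t(x_0)$ and the sample values of $s$, $t$ and $x_0$ chosen arbitrarily.

```python
import math

def flow(t, x0):
    """Flow operator of the logistic equation: x0*e^t / (x0*e^t - x0 + 1)."""
    e = math.exp(t)
    return x0 * e / (x0 * e - x0 + 1.0)

x0, s, t = 0.2, 0.4, 0.7

print(flow(0.0, x0))                          # identity: returns x0
print(flow(t, flow(s, x0)), flow(s + t, x0))  # composition equals phi^(s+t)
print(flow(-t, flow(t, x0)))                  # phi^(-t) undoes phi^t
print(flow(50.0, x0))                         # approaches 1 as t grows
```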


Equilibrium points of (15.17) satisfy $\phi^t(\mathbf{x}^*) = \mathbf{x}^*$ for all $t \in \mathbb{R}$ (or equivalently
$\mathbf{f}(\mathbf{x}^*) = 0$). Periodic solutions of (15.17) with period $T$ satisfy $\mathbf{x}^*(t) = \mathbf{x}^*(t + T)$
for all $t \in \mathbb{R}$, and can be written in terms of the flow operator as $\phi^{t+T}(\mathbf{x}^*) = \phi^t(\mathbf{x}^*)$
for all $t \in \mathbb{R}$.

One way of obtaining a map from the system (15.17) is to sample the solution
with period $\tau$, so that $\phi^n(\mathbf{x}_0) = \mathbf{x}(t_0 + n\tau)$ for $n \in \mathbb{Z}$. This allows us to track
the trajectory of particles in a stroboscopic way. A more useful map, which we
can use to analyze the behaviour of integral paths close to a periodic solution, is
the Poincaré return map. Let $\gamma$ be a trajectory of (15.17), and consider the
intersections of $\gamma$ with $\Sigma \subset \mathbb{R}^n$ such that

(i) $\Sigma$ has dimension $n - 1$,

(ii) $\Sigma$ is transverse (nowhere parallel) to the integral paths,

(iii) all solutions in the neighbourhood of $\gamma$ pass through $\Sigma$.

If $\gamma$ intersects $\Sigma$ at $\mathbf{x} = \mathbf{p}$, and its next intersection with $\Sigma$ is at $\mathbf{x} = \mathbf{q}$, then
the Poincaré return map, $P : \Sigma \to \Sigma$, maps the point $\mathbf{p}$ to the point $\mathbf{q}$. This is
illustrated in Figure 15.19 for $n = 3$ where $\Sigma$ is a plane. Note that an equilibrium
point of the Poincaré return map corresponds to a limit cycle that intersects $\Sigma$
once, and a periodic solution of period $k$ corresponds to a limit cycle that intersects
$\Sigma$ $k$ times.


Example 

Let’s try to construct the Poincaré return map for (15.17) with $n = 2$ when

$$\mathbf{f} = \begin{pmatrix} 4x + 4y - x(x^2 + y^2) \\ 4y - 4x - y(x^2 + y^2) \end{pmatrix},$$

and $\Sigma$ is the positive $x$-axis. It is easier to write this system in terms of plane polar
coordinates, and, using (9.22) and (9.23), we find that $\dot{r} = r(4 - r^2)$ and $\dot{\theta} = -4$.
These expressions can be integrated with initial conditions in $\Sigma$, the positive $x$-axis,
so that $r(0) = x_0$ and $\theta(0) = 0$, to obtain

$$r = \left(\frac{4x_0^2 e^{8t}}{4 - x_0^2 + x_0^2 e^{8t}}\right)^{1/2}, \quad \theta = -4t.$$

The orbit next returns to this axis when $\theta$ is an integer multiple of $2\pi$, that is when
$t = \pi/2$, so that

$$x_1 = P(x_0) = \left(\frac{4x_0^2 e^{4\pi}}{4 - x_0^2 + x_0^2 e^{4\pi}}\right)^{1/2}.$$

The solution for $x_0 = 10$ is shown in Figure 15.20, along with the Poincaré return
map, which rapidly asymptotes to $x = 2$.
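Because this return map is available in closed form, its iterates are easy to compute. The Python sketch below (our own illustration; the name `poincare` is an assumption, and the formula is the one obtained by evaluating $r(t)$ at $t = \pi/2$) iterates $P$ from $x_0 = 10$ and from a point close to the origin.

```python
import math

def poincare(x0):
    """Return map on the positive x-axis: r(t) evaluated at t = pi/2."""
    e = math.exp(4.0 * math.pi)
    return math.sqrt(4.0 * x0 ** 2 * e / (4.0 - x0 ** 2 + x0 ** 2 * e))

for start in (10.0, 0.01):
    x = start
    iterates = []
    for _ in range(4):
        x = poincare(x)
        iterates.append(x)
    print(start, iterates)
```

The factor $e^{4\pi}$ is so large that a single application of $P$ already takes $x_0 = 10$ to within about $10^{-5}$ of the fixed point $x = 2$, which corresponds to the limit cycle $r = 2$.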


Example: The Lorenz equations 

Solutions of the Lorenz equations repeatedly cross the plane $y = -z$, which
separates the two equilibrium points at $x = 27$, $y = z = \pm 6\sqrt{2}$. Indeed the equations
are symmetric about this plane, since they are unchanged by the transformation
$y \mapsto -y$, $z \mapsto -z$. We can therefore use the plane $y = -z$ as $\Sigma$, and hence
define a Poincaré return map. In order to investigate this map, we must proceed
numerically. In MATLAB, we need to define an event function
Fig. 15.20. The solution when $x_0 = 10$ and the corresponding Poincaré return map.


function [value, isterminal, direction] = lorenzevent(t,y)
value = y(2)+y(3); isterminal = 0; direction = 1;

This function returns the value zero when the solution intersects $\Sigma$ (when $y + z =$
y(2)+y(3) $= 0$), but only as the solution approaches $\Sigma$ with $y + z$ increasing
(direction = 1), and allows the integration to continue (isterminal = 0). We
can then use the commands



This integrates the Lorenz equations, which are in the function lor 



for $0 \leq t \leq 500$, starting from $x = y = z = 1$ when $t = 0$. Note that the function
odeset allows us to create a variable options that we can pass to ode45 as an
argument, which controls its execution. In this case, we tell ode45 to detect the
event coded for in lorenzevent. The variable† solution then contains the points
where the solution crosses $\Sigma$, in solution.ye, at times solution.xe. The Poincaré
return map is shown in Figure 15.21. As you can see, the map is effectively
one-dimensional, since the points where the solution meets $\Sigma$ lie on a simple curve.
The dynamics are, however, apparently chaotic (see Sparrow, 1982, for a detailed
discussion of the Lorenz equations).
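The idea behind the event detection can be sketched without any ODE library. The Python code below is an illustration under stated assumptions, not the MATLAB calculation used for Figure 15.21: it integrates the Lorenz equations with a fixed-step fourth-order Runge-Kutta method, watches for $y + z$ changing sign from negative to positive, and locates each crossing of $\Sigma$ by linear interpolation. The form of the equations (the standard Lorenz system with parameters 10, 28 and 8/3, with the variables ordered so that the equilibria lie at $x = 27$, $y = z = \pm 6\sqrt{2}$), the step size and the end time are our assumptions.

```python
def lorenz(u):
    """Lorenz equations, variables ordered so that equilibria have x = 27."""
    x, y, z = u
    return (-8.0 * x / 3.0 + y * z,
            -10.0 * (y - z),
            -x * y + 28.0 * y - z)

def rk4_step(u, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    def shifted(k, c):
        return tuple(ui + c * ki for ui, ki in zip(u, k))
    k1 = lorenz(u)
    k2 = lorenz(shifted(k1, h / 2))
    k3 = lorenz(shifted(k2, h / 2))
    k4 = lorenz(shifted(k3, h))
    return tuple(ui + h * (a + 2 * b + 2 * c + d) / 6
                 for ui, a, b, c, d in zip(u, k1, k2, k3, k4))

def section_crossings(u0=(1.0, 1.0, 1.0), t_end=100.0, h=0.005):
    """Points where the solution crosses y + z = 0 with y + z increasing."""
    crossings = []
    u = u0
    for _ in range(int(round(t_end / h))):
        v = rk4_step(u, h)
        e0, e1 = u[1] + u[2], v[1] + v[2]
        if e0 < 0.0 <= e1:
            s = e0 / (e0 - e1)   # fraction of the step at which y + z = 0
            crossings.append(tuple(a + s * (b - a) for a, b in zip(u, v)))
        u = v
    return crossings

crossings = section_crossings()
print(len(crossings), "crossings found; first:", crossings[0])
```

Plotting the $x$-coordinate of each crossing against that of the previous one gives a discrete picture of the return map. An adaptive integrator with proper event location, such as the one used in the text, would locate the crossings more accurately than this linear interpolation.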
15.4 Homoclinic Tangles

Let’s now consider a two-dimensional Poincaré return map, with $\mathbf{f} : \Sigma \to \Sigma$ and
$\Sigma \subset \mathbb{R}^2$, associated with a three-dimensional system, (15.17). Each limit cycle solution
of (15.17) is associated with an equilibrium point, $\bar{\mathbf{x}}$, of the map. If this equilibrium
point is stable, the limit cycle is stable. Let’s assume that a particular equilibrium
point, $\bar{\mathbf{x}}$, of the Poincaré return map, which corresponds to a limit cycle solution

† The variable solution, produced by ode45, is a structure, and effectively contains more than
one type of data in its definition.


Fig. 15.21. A Poincaré return map for the Lorenz equations. (Upper panel: points in the
Poincaré return map, plotted against $x$. Lower panel: successive iterations of the Poincaré
return map, plotted against $t$.)
of (15.17), has a one-dimensional stable manifold, $\omega^s(\bar{\mathbf{x}})$, and a one-dimensional
unstable manifold, $\omega^u(\bar{\mathbf{x}})$. As we saw earlier, it is possible for these manifolds to
intersect. As we shall see, if this intersection is transverse, the manifolds become
tangled, intersecting infinitely often, as shown in Figure 15.22. The manifolds then
contain embedded horseshoe maps, and thus have chaotic solutions. We will now
discuss why this should be so. Although we will proceed through lemmas and
theorems, our approach is fairly informal, and should not be read as a rigorous
proof.



Fig. 15.22. A homoclinic tangle. 


A homoclinic point is a point $\mathbf{x} \neq \bar{\mathbf{x}}$ that lies in the set $\omega^s(\bar{\mathbf{x}}) \cap \omega^u(\bar{\mathbf{x}})$, the
intersection of the stable and unstable manifolds of $\bar{\mathbf{x}}$. Such a point asymptotes to $\bar{\mathbf{x}}$
as $n \to \pm\infty$. Under successive applications of the map and its inverse, a homoclinic
point is mapped to a homoclinic orbit, a discrete set of points that is a subset of
the stable and unstable manifolds.

We will now assume that the map $\mathbf{f}$ is area-preserving, so that, for any set
$D \subset \Sigma$, the area of $\mathbf{f}(D)$ is equal to that of $D$. This assumption greatly simplifies
the following discussion, but is not actually necessary (see Wiggins, 1988).

Lemma 15.2 If $\mathbf{x}_0$ is a transverse homoclinic point, then all positive and negative
iterates of $\mathbf{x}_0$ have the same orientation, either US or SU, as shown in Figure 15.23.


Proof The Poincaré return map is associated with a flow. Figure 15.24 shows
the trajectory through the points $\mathbf{x}_0$ and $\mathbf{f}(\mathbf{x}_0)$. Since the stable and unstable
manifolds associated with this trajectory vary continuously, their orientations can
at most rotate and cannot flip during their passage from one side of $\Sigma$ to the other.

□ 
Fig. 15.23. US and SU orientation at a homoclinic point. Remember that the manifolds
are stable or unstable with regard to the equilibrium point $\bar{\mathbf{x}}$, not $\mathbf{x}_0$.



Fig. 15.24. The stable and unstable manifolds of the limit cycle. 


Lemma 15.3 If $\mathbf{x}_0$ is a transverse homoclinic point, then there must be another
transverse homoclinic point between $\mathbf{x}_0$ and $\mathbf{f}(\mathbf{x}_0)$. In other words, the next
transverse homoclinic point along, say, $\omega^s$ after $\mathbf{x}_0$ cannot be $\mathbf{f}(\mathbf{x}_0)$.

Proof If the orientation associated with $\mathbf{x}_0$ is US, then the next homoclinic point
along $\omega^s$, $\mathbf{z}$, has SU orientation, and by Lemma 15.2, $\mathbf{z} \neq \mathbf{f}(\mathbf{x}_0)$, as shown in
Figure 15.25. □

Note that this means that there are at least two transverse homoclinic orbits, one
associated with $\mathbf{x}_0$ and one associated with $\mathbf{z}$.



Fig. 15.25. The orientation of successive homoclinic points. 


Lemma 15.4 The existence of a single transverse homoclinic point ensures the 
existence of an infinite number of transverse homoclinic points. 

Proof We consider the image of the points within the lobe $L_0$, which is bounded
by the stable and unstable manifolds through $\mathbf{x}_0$ and $\mathbf{z}$, as shown in Figure 15.26.
The image of $L_0$ is the lobe $L_1$, which is bounded by the portions of $\omega^u$ and $\omega^s$
between $\mathbf{f}(\mathbf{x}_0)$ and $\mathbf{f}(\mathbf{z})$. Similarly, if $\mathbf{x} \in L_0$ then $\mathbf{f}^j(\mathbf{x}) \in L_j$, which is bounded
by the portions of $\omega^u$ and $\omega^s$ between $\mathbf{f}^j(\mathbf{x}_0)$ and $\mathbf{f}^j(\mathbf{z})$. Since we know that
$\mathbf{f}^n(\mathbf{x}_0) \to \bar{\mathbf{x}}$ and $\mathbf{f}^n(\mathbf{z}) \to \bar{\mathbf{x}}$ along $\omega^s$ as $n \to \infty$, the distance between these
images must decrease. However, the distance between the points along $\omega^u$ must
increase. This leads to long thin lobes, bounded by short sections of $\omega^s$ and long
sections of $\omega^u$. We therefore have a finite area covered by an infinite set of lobes
of finite area equal to the area of $L_0$ (recall that we are assuming that the map
preserves areas). Consequently, these lobes must overlap. We can assume, without
loss of generality, that they overlap between $\mathbf{z}$ and $\mathbf{f}(\mathbf{x}_0)$. There are, therefore,
two further transverse intersection points, which we label $\mathbf{a}$ and $\mathbf{b}$. We can repeat
this argument indefinitely, and conclude that there must be an infinite number of
homoclinic points. □

Theorem 15.2 (Smale-Birkhoff) Let $\mathbf{f} : \mathbb{R}^2 \to \mathbb{R}^2$ be a diffeomorphism with a
hyperbolic equilibrium point $\bar{\mathbf{x}}$. If $\omega^s(\bar{\mathbf{x}})$ and $\omega^u(\bar{\mathbf{x}})$ intersect transversally at a point
other than $\bar{\mathbf{x}}$, in other words, if there is a homoclinic tangle, then the map has a
horseshoe map embedded within it.


Note that, since horseshoe maps have chaotic orbits, this theorem shows that the
existence of a homoclinic tangle implies the existence of chaotic orbits. As we
shall see later, there is an algebraic test, Mel’nikov’s method, that can be used to
determine whether a system has a transverse homoclinic point.

Fig. 15.26. The homoclinic points $\mathbf{a}$ and $\mathbf{b}$.


Proof We will simply give an outline of the proof here. A more detailed proof, 
which uses the idea of Markov partitioning, is given in Guckenheimer and Holmes 
(1983). Our aim here is simply to convince you that the forward and backward 
maps of the region D , which we define below, intersect. 

Consider a region $D$ that contains a transverse homoclinic point, $\mathbf{x}_0$, associated
with an equilibrium point, $\bar{\mathbf{x}}$, of a map $\mathbf{f} : \mathbb{R}^2 \to \mathbb{R}^2$, as shown in Figure 15.27. Then


(i) $\mathbf{f}^k(D)$ contains the iterates $\mathbf{f}^k(\mathbf{x}_0)$ for $k \in \mathbb{Z}$.

(ii) For $k < 0$, $\mathbf{f}^k(D)$ is stretched along the stable manifold and contracted along
the unstable manifold.

(iii) For $k > 0$, $\mathbf{f}^k(D)$ is stretched along the unstable manifold and contracted
along the stable manifold.

(iv) For some backward iterate $-q$, $\mathbf{f}^{-q}(D) = D^-$ will have a horseshoe shape
and intersect with $\mathbf{f}^p(D) = D^+$ for some forward iterate $p$.

(v) The map $\mathbf{f}^{-(p+q)}(D^+) = D^-$, which maps the region $D^+$ into a
horseshoe-shaped region, $D^-$, with overlap between $D^-$ and $D^+$, is a horseshoe map.


□ 






Fig. 15.27. A horseshoe map arising from the dynamics in a homoclinic tangle. 


15.4.1 Mel’nikov Theory 

We will now describe an algebraic test with which we can determine whether a 
system has a homoclinic tangle, and hence, by Theorem 15.2, chaotic solutions. We 
will focus on two-dimensional Hamiltonian systems perturbed by a periodic function 
of time only, although the method can also be used for perturbed non-Hamiltonian 
systems (see Wiggins, 1988). 

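We consider a Hamiltonian system (see Section 9.3.9) with Hamiltonian $H =
H(x, y)$ and associated differential equations

$$\dot{\mathbf{x}} = \mathbf{f}(\mathbf{x}),$$

where $\mathbf{f} = (-\partial H/\partial y, \partial H/\partial x)^T$ or, written in component form,

$$\frac{dx}{dt} = -\frac{\partial H}{\partial y}, \quad \frac{dy}{dt} = \frac{\partial H}{\partial x}. \tag{15.18}$$

We will consider this system under the influence of a small perturbation that is a
function of space and periodic in time, in the form

$$\dot{\mathbf{x}} = \mathbf{f}(\mathbf{x}) + \epsilon\,\mathbf{g}(\mathbf{x}, t), \tag{15.19}$$

with $\mathbf{g}(\mathbf{x}, t) = \mathbf{g}(\mathbf{x}, t + T)$ for some period $T > 0$, and $\epsilon \ll 1$. We start by considering
the unperturbed system.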
15.4.2 Unperturbed System ($\epsilon = 0$)

As we saw in Section 9.3.9, this system is area-preserving, and its equilibrium
points are either nonlinear centres or saddles. We suppose that $\mathbf{x}_0$ is a saddle point
and that there is a homoclinic orbit $\mathbf{x}^0(t)$ such that $\mathbf{x}^0(t) \to \mathbf{x}_0$ as $t \to \pm\infty$. The
interior of the homoclinic orbit must contain concentric limit cycles surrounding a
centre at $\mathbf{x}^*$, as shown in Figure 15.28.

Fig. 15.28. A homoclinic orbit, $\mathbf{x} = \mathbf{x}^0(t)$, of the unperturbed Hamiltonian system.


15.4.3 Perturbed System ($0 < \epsilon \ll 1$)

We can think of the perturbed system as autonomous in the three-dimensional
$(x, y, t)$-phase space. We can define an associated map by stroboscopically sampling
the flow at $t = nT$ for $n \in \mathbb{Z}$. Since $\mathbf{g}(\mathbf{x}, t) = \mathbf{g}(\mathbf{x}, t + nT)$, this is equivalent to a
Poincaré return map with $\Sigma$ the plane $t = 0$. Note that the equilibrium point, $\mathbf{x}_0$,
of the unperturbed system of differential equations is also an equilibrium point of
the associated map, and that the stable and unstable manifolds of $\mathbf{x}_0$ are the same
for the unperturbed system of differential equations and the associated map.

We assume that the influence of the perturbation is to modify the equilibrium
point of the map to the point $\mathbf{x}_\epsilon$, with, since $\epsilon \ll 1$, $|\mathbf{x}_\epsilon - \mathbf{x}_0| \ll 1$. For the perturbed
map, the stable and unstable manifolds of the saddle point do not necessarily
connect smoothly, as shown in Figure 15.29. We would now like to find some
way of distinguishing between the two cases: either disjoint or tangled manifolds.
Let $t_0$ denote the time at which we wish to consider the fate of the solutions. Since
the unperturbed system is autonomous, its orbits are invariant under arbitrary
transformations in time, so that the orbits $\mathbf{x}^0(t)$ and $\mathbf{x}^0(t - t_0)$ are the same. This
is not true for the solutions of the perturbed system. In Figure 15.30, we show the
orbit $\mathbf{x}^0(t)$ as a dashed line and the stable and unstable manifolds of the saddle
point of the perturbed system as solid lines. We will now introduce the idea of the
distance between points on the stable and unstable manifolds of the saddle point.
The idea is to find an expression for this distance and determine whether it can be
zero. We can do this by defining the Mel’nikov function, which then provides an
algebraic sufficient condition for the existence of transverse homoclinic intersections
and hence chaotic dynamics.

Fig. 15.29. The perturbed stable and unstable manifolds of the associated map.



Fig. 15.30. The perturbed (solid lines) and unperturbed (broken lines) stable and unstable 
manifolds of the saddle point. 


The Mel’nikov function, $D(t, t_0)$, is proportional to the component of the
distance between corresponding points on the stable and unstable manifolds in the
direction normal to the unperturbed homoclinic orbit, and is given by

\[ D(t, t_0) = \mathbf{N}(t, t_0) \cdot \mathbf{d}(t, t_0). \]

Here $\mathbf{N}(t, t_0)$ is a normal to the unperturbed orbit and $\mathbf{d}(t, t_0)$ connects
corresponding points on the stable and unstable manifolds. If $D(t, t_0)$ has simple zeros, then
there must be transverse homoclinic intersections, and we can conclude that the
system has chaotic solutions.

Firstly, we construct the normal to the unperturbed homoclinic orbit, $\mathbf{x}^0(t)$. The
tangent to the orbit is $\mathbf{f}$, and hence $\mathbf{N}(t, t_0) \cdot \mathbf{f}(\mathbf{x}^0(t - t_0)) = 0$. In component form,





$N_1 f_1 + N_2 f_2 = 0$, where $\mathbf{f} = (f_1, f_2)$, so we can take

\[ \mathbf{N} = \left( -f_2(\mathbf{x}^0(t - t_0)),\; f_1(\mathbf{x}^0(t - t_0)) \right). \]

Note that $\mathbf{N}$ is not a unit normal. The vector displacement between corresponding
points on orbits $\mathbf{x}^s(t)$ and $\mathbf{x}^u(t)$ in the stable and unstable manifolds of $\mathbf{x}_\epsilon$ is
$\mathbf{d}(t, t_0) = \mathbf{x}^s(t, t_0) - \mathbf{x}^u(t, t_0)$, so the Mel’nikov function is

\[ D(t, t_0) = \mathbf{N}(t, t_0) \cdot \mathbf{d}(t, t_0) = -f_2 d_1 + f_1 d_2, \]

where $\mathbf{d}(t, t_0) = (d_1(t, t_0), d_2(t, t_0))^T$. For notational convenience, we now introduce
the binary operation $\wedge$ such that $\mathbf{u} \wedge \mathbf{v} = u_1 v_2 - v_1 u_2$, so that
$D = \mathbf{f} \wedge \mathbf{d}$. We can now use perturbation theory to obtain an expression for $D(t, t_0)$.

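We assume that points on the orbits associated with $\mathbf{x}_\epsilon$ remain close to points
on the homoclinic orbit $\mathbf{x}^0(t - t_0)$, so that we can write their locations as a
position on the homoclinic orbit plus a small perturbation proportional to $\epsilon$. We also
introduce the superscript $s,u$ so that we can discuss the stable and unstable cases
simultaneously, writing

\[ \mathbf{x}^{s,u}(t, t_0) \sim \mathbf{x}^0(t - t_0) + \epsilon \mathbf{x}_1^{s,u}(t, t_0), \qquad (15.20) \]

and hence

\[ \mathbf{d}(t, t_0) \sim \epsilon \left( \mathbf{x}_1^s(t, t_0) - \mathbf{x}_1^u(t, t_0) \right) \]

and

\[ D \sim \epsilon \left( \mathbf{f} \wedge \mathbf{x}_1^s - \mathbf{f} \wedge \mathbf{x}_1^u \right). \]

It is now convenient to introduce two subsidiary Mel’nikov functions, $D^s(t, t_0)$ and
$D^u(t, t_0)$, such that

\[ D \sim \epsilon D^s(t, t_0) - \epsilon D^u(t, t_0), \]

where $D^{s,u}(t, t_0) = \mathbf{f} \wedge \mathbf{x}_1^{s,u}$. We now substitute (15.20) into the perturbed system,
(15.19), and obtain

\[ \dot{\mathbf{x}}^0 + \epsilon \dot{\mathbf{x}}_1^{s,u} = \mathbf{f}(\mathbf{x}^0 + \epsilon \mathbf{x}_1^{s,u}) + \epsilon \mathbf{g}(\mathbf{x}^0 + \epsilon \mathbf{x}_1^{s,u}, t) \]
\[ = \mathbf{f}(\mathbf{x}^0) + \epsilon D\mathbf{f}(\mathbf{x}^0) \mathbf{x}_1^{s,u} + \epsilon \mathbf{g}(\mathbf{x}^0, t) + O(\epsilon^2), \]

where $D\mathbf{f}(\mathbf{x}^0)$ is the Jacobian of the unperturbed system. This equation is
automatically satisfied at leading order, since $\mathbf{x}^0$ is a solution of the unperturbed
equation, whilst at $O(\epsilon)$,

\[ \dot{\mathbf{x}}_1^{s,u}(t, t_0) = D\mathbf{f}(\mathbf{x}^0(t - t_0)) \mathbf{x}_1^{s,u}(t, t_0) + \mathbf{g}(\mathbf{x}^0(t - t_0), t). \qquad (15.21) \]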

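Now, differentiating the functions $D^{s,u}(t, t_0)$, we obtain

\[ \dot{D}^{s,u}(t, t_0) = \dot{\mathbf{f}} \wedge \mathbf{x}_1^{s,u} + \mathbf{f} \wedge \dot{\mathbf{x}}_1^{s,u}, \]

since there is a product rule associated with the $\wedge$ operator. However, we note that

\[ \dot{\mathbf{f}} = \frac{d}{dt} \mathbf{f}(\mathbf{x}^0(t - t_0)) = D\mathbf{f}(\mathbf{x}^0(t - t_0)) \dot{\mathbf{x}}^0(t - t_0), \]

which implies that $\dot{\mathbf{f}} = D\mathbf{f}\,\mathbf{f}$, and hence

\[ \dot{D}^{s,u}(t, t_0) = (D\mathbf{f}\,\mathbf{f}) \wedge \mathbf{x}_1^{s,u} + \mathbf{f} \wedge \dot{\mathbf{x}}_1^{s,u}. \]

Substituting from (15.21) into this equation gives

\[ \dot{D}^{s,u}(t, t_0) = (D\mathbf{f}\,\mathbf{f}) \wedge \mathbf{x}_1^{s,u} + \mathbf{f} \wedge D\mathbf{f}\,\mathbf{x}_1^{s,u} + \mathbf{f} \wedge \mathbf{g}. \]

If we now use the identity

\[ D\mathbf{f}\,\mathbf{f} \wedge \mathbf{x} + \mathbf{f} \wedge D\mathbf{f}\,\mathbf{x} = (\nabla \cdot \mathbf{f})\, \mathbf{f} \wedge \mathbf{x}, \]

and note that $\nabla \cdot \mathbf{f} = 0$ since the unperturbed system is Hamiltonian, we find that

\[ \dot{D}^{s,u}(t, t_0) = \mathbf{f} \wedge \mathbf{g}. \]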
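
The identity used in this step is a statement about planar vectors and a 2 x 2 matrix: for any matrix $A$, $(A\mathbf{u}) \wedge \mathbf{v} + \mathbf{u} \wedge (A\mathbf{v}) = \mathrm{tr}(A)\, \mathbf{u} \wedge \mathbf{v}$, and the trace of the Jacobian is $\nabla \cdot \mathbf{f}$. The following Python sketch (not part of the original text; the function names are illustrative) checks it numerically on random data, which is a sanity check rather than a proof.

```python
import random


def wedge(u, v):
    """Planar wedge product: u ^ v = u1*v2 - v1*u2."""
    return u[0] * v[1] - v[0] * u[1]


def matvec(a, v):
    """Product of a 2x2 matrix (list of rows) with a 2-vector."""
    return (a[0][0] * v[0] + a[0][1] * v[1],
            a[1][0] * v[0] + a[1][1] * v[1])


def identity_residual(a, f, x):
    """(A f) ^ x + f ^ (A x) - tr(A) f ^ x, which should vanish."""
    trace = a[0][0] + a[1][1]
    return wedge(matvec(a, f), x) + wedge(f, matvec(a, x)) - trace * wedge(f, x)


random.seed(0)
worst = 0.0
for _ in range(1000):
    a = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(2)]
    f = (random.uniform(-1, 1), random.uniform(-1, 1))
    x = (random.uniform(-1, 1), random.uniform(-1, 1))
    worst = max(worst, abs(identity_residual(a, f, x)))

print("largest residual:", worst)
```

The residual stays at rounding-error level, consistent with the identity holding for every matrix and every pair of vectors.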

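Considering the unstable manifold first, we integrate from $-\infty$ to $t_0$ to obtain

\[ \int_{-\infty}^{t_0} \dot{D}^u(t, t_0)\, dt = \int_{-\infty}^{t_0} \mathbf{f} \wedge \mathbf{g}\, dt, \]

which gives

\[ D^u(t_0, t_0) - D^u(-\infty, t_0) = \int_{-\infty}^{t_0} \mathbf{f} \wedge \mathbf{g}\, dt. \qquad (15.22) \]

Now consider the stable case integrated from $t_0$ to $+\infty$, which gives

\[ \int_{t_0}^{\infty} \dot{D}^s(t, t_0)\, dt = \int_{t_0}^{\infty} \mathbf{f} \wedge \mathbf{g}\, dt, \]

and hence

\[ -D^s(t_0, t_0) + D^s(\infty, t_0) = \int_{t_0}^{\infty} \mathbf{f} \wedge \mathbf{g}\, dt. \qquad (15.23) \]

We note that as $t \to -\infty$ along the unstable orbit and as $t \to \infty$ along the stable
orbit, both solutions tend to $\mathbf{x}_\epsilon$, so that $D^u(-\infty, t_0) = D^s(\infty, t_0)$. Adding (15.22)
and (15.23) and changing notation slightly so that we replace $(t_0, t_0)$ with $(t_0)$, we
have

\[ D(t_0) = -\epsilon \int_{-\infty}^{\infty} \mathbf{f} \wedge \mathbf{g}\, dt
 = -\epsilon \int_{-\infty}^{\infty} \left[ \mathbf{f}(\mathbf{x}^0(t - t_0)) \wedge \mathbf{g}(\mathbf{x}^0(t - t_0), t) \right] dt. \]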

Note that 

(i) If $D(t_0)$ has simple zeros for sufficiently small $\epsilon$, then $\omega^u$ and $\omega^s$ intersect
transversally, and hence by Theorem 15.2, there are chaotic solutions.

(ii) If $D(t_0)$ is bounded away from zero, there are no transverse homoclinic
intersections.

(iii) If we can find one zero of a Mel’nikov function, the dynamics of the system
ensure that there will be infinitely many zeros.

Example

Consider the perturbed Hamiltonian system given by the forced Duffing equation,
(15.3),

\[ \ddot{x} = x - x^3 - \epsilon(\delta \dot{x} - \gamma \cos \omega t), \]

when $\epsilon \ll 1$. For what values of $\delta$, $\gamma$ and $\omega$ are there chaotic solutions?
We begin by rewriting the system as

\[ \frac{d}{dt} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} y \\ x - x^3 - \epsilon(\delta y - \gamma \cos \omega t) \end{pmatrix}. \qquad (15.24) \]

The unperturbed system is

\[ \frac{d}{dt} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} y \\ x - x^3 \end{pmatrix}. \]

This is a Hamiltonian system with

\[ \frac{d}{dt} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} \partial H/\partial y \\ -\partial H/\partial x \end{pmatrix}, \]

and Hamiltonian

\[ H = \frac{1}{2} y^2 + \frac{1}{4} x^4 - \frac{1}{2} x^2. \]

The unperturbed system has equilibrium points at $(x, y) = (0, 0)$ and $(x, y) =
(\pm 1, 0)$, and it is straightforward to show that the origin is a saddle point and that
the other two equilibria are centres. The unperturbed phase portrait is shown in
Figure 15.31. We now need to determine the equation of the homoclinic orbit. This
orbit must pass through the origin, where $H = 0$. However, $H$ is a constant on the
orbit, so the homoclinic orbit is given by

\[ y^2 = x^2 - \frac{1}{2} x^4. \]

Setting $y$ equal to zero, we find that the homoclinic orbit also meets the $x$-axis
where $x = \pm\sqrt{2}$. We will take $y(0) = 0$ and $x(0) = \sqrt{2}$. We can then solve the
differential equation for the homoclinic orbit and find that $x^0(t) = \sqrt{2}\, \mathrm{sech}\, t$. This
gives us

\[ x^0(t - t_0) = \sqrt{2}\, \mathrm{sech}(t - t_0), \qquad y^0(t - t_0) = -\sqrt{2}\, \mathrm{sech}(t - t_0) \tanh(t - t_0). \]
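
As a quick check of the homoclinic orbit quoted above, the Python sketch below (an illustration added here, not code from the text) confirms numerically that $x^0(t) = \sqrt{2}\,\mathrm{sech}\,t$, $y^0(t) = -\sqrt{2}\,\mathrm{sech}\,t \tanh t$ lies on the level set $H = 0$ and satisfies the unperturbed equations. Derivatives are approximated by central differences, so the residuals are small but not exactly zero.

```python
import math


def sech(t):
    return 1.0 / math.cosh(t)


def x0(t):
    """x-component of the unperturbed homoclinic orbit."""
    return math.sqrt(2.0) * sech(t)


def y0(t):
    """y-component of the unperturbed homoclinic orbit."""
    return -math.sqrt(2.0) * sech(t) * math.tanh(t)


def hamiltonian(x, y):
    return 0.5 * y * y + 0.25 * x ** 4 - 0.5 * x * x


h = 1e-5          # step for central differences
max_h = 0.0       # largest |H| found on the orbit
max_dx = 0.0      # largest |dx/dt - y|
max_dy = 0.0      # largest |dy/dt - (x - x^3)|
for k in range(-50, 51):
    t = 0.1 * k
    x, y = x0(t), y0(t)
    dxdt = (x0(t + h) - x0(t - h)) / (2.0 * h)
    dydt = (y0(t + h) - y0(t - h)) / (2.0 * h)
    max_h = max(max_h, abs(hamiltonian(x, y)))
    max_dx = max(max_dx, abs(dxdt - y))
    max_dy = max(max_dy, abs(dydt - (x - x ** 3)))

print(max_h, max_dx, max_dy)
```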


We can now construct the Mel’nikov function. The vectors $\mathbf{f}$ and $\mathbf{g}$ are

\[ \mathbf{f}(\mathbf{x}^0(t - t_0)) = \begin{pmatrix} y^0(t - t_0) \\ x^0(t - t_0) - (x^0(t - t_0))^3 \end{pmatrix} \]

and

\[ \mathbf{g}(\mathbf{x}^0(t - t_0), t) = \begin{pmatrix} 0 \\ \gamma \cos \omega t - \delta y^0(t - t_0) \end{pmatrix}, \]

so that their wedge product is

\[ \mathbf{f} \wedge \mathbf{g} = y^0(t - t_0) \left( \gamma \cos \omega t - \delta y^0(t - t_0) \right). \]

Fig. 15.31. The phase portrait of (15.24) with $\epsilon = 0$.


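Now, using the definition of the Mel’nikov function,

\[ D(t_0) = -\epsilon \int_{-\infty}^{\infty} \left\{ -\sqrt{2}\, \mathrm{sech}(t - t_0) \tanh(t - t_0) \right\}
 \times \left\{ \gamma \cos \omega t + \delta \sqrt{2}\, \mathrm{sech}(t - t_0) \tanh(t - t_0) \right\} dt. \]

This can be integrated to give

\[ D(t_0) = -\epsilon \left\{ \sqrt{2}\, \pi \gamma \omega\, \mathrm{sech}\left( \frac{\pi \omega}{2} \right) \sin \omega t_0 - \frac{4 \delta}{3} \right\}. \]

This has simple zeros when

\[ \delta = \frac{3 \sqrt{2}\, \pi}{4} \gamma \omega\, \mathrm{sech}\left( \frac{\pi \omega}{2} \right) \sin \omega t_0, \]

provided that

\[ \delta < \frac{3 \sqrt{2}\, \pi}{4} \gamma \omega\, \mathrm{sech}\left( \frac{\pi \omega}{2} \right). \]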

This means that there are transverse homoclinic points when this condition is
satisfied, and we can infer that the system is chaotic. Figure 15.32 shows the Poincaré
return map when $\epsilon = 0.1$, $\delta = 0$, $\gamma = 1$ and $\omega =$ [?].

Fig. 15.32. The Poincaré return map for the forced Duffing equation when $\epsilon = 0.1$, $\delta = 0$,
$\gamma = 1$ and $\omega =$ [?]. (Panels: points in the Poincaré return map; successive iterations of
the Poincaré return map, plotted against $t$.)
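
The closed form of the Mel’nikov function for the forced Duffing equation can be compared with a direct numerical evaluation of $-\epsilon \int \mathbf{f} \wedge \mathbf{g}\, dt$ along the homoclinic orbit. The Python sketch below is an illustration added here, with assumed parameter values in the usage line and an assumed truncation of the infinite range to a finite window; it uses the trapezium rule, which is very accurate for this rapidly decaying integrand.

```python
import math


def sech(t):
    return 1.0 / math.cosh(t)


def melnikov_numeric(t0, eps, delta, gamma, omega, half_width=40.0, n=80000):
    """-eps * integral of f ^ g along the homoclinic orbit (trapezium rule).

    The infinite range is truncated to [t0 - half_width, t0 + half_width].
    """
    a = t0 - half_width
    h = 2.0 * half_width / n
    total = 0.0
    for k in range(n + 1):
        t = a + k * h
        y0 = -math.sqrt(2.0) * sech(t - t0) * math.tanh(t - t0)
        integrand = y0 * (gamma * math.cos(omega * t) - delta * y0)
        total += integrand * (0.5 if k in (0, n) else 1.0)
    return -eps * h * total


def melnikov_closed(t0, eps, delta, gamma, omega):
    """Closed form: -eps * (sqrt(2) pi gamma omega sech(pi omega / 2) sin(omega t0) - 4 delta / 3)."""
    return -eps * (math.sqrt(2.0) * math.pi * gamma * omega
                   * sech(0.5 * math.pi * omega) * math.sin(omega * t0)
                   - 4.0 * delta / 3.0)


def delta_threshold(gamma, omega):
    """Simple zeros exist when delta is below this value."""
    return 0.75 * math.sqrt(2.0) * math.pi * gamma * omega * sech(0.5 * math.pi * omega)


# Illustrative parameter values (assumed, not taken from the text).
print(melnikov_numeric(0.7, 0.1, 0.25, 1.0, 1.0), melnikov_closed(0.7, 0.1, 0.25, 1.0, 1.0))
print(delta_threshold(1.0, 1.0))
```

Agreement between the two evaluations supports the sign and the factor $4\delta/3$ in the closed form; it does not replace the analytical integration.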


15.5 Quantifying Chaos: Lyapunov Exponents and the Lyapunov 
Spectrum 

We have now seen how to determine analytically when chaotic solutions exist for 
weakly, periodically perturbed Hamiltonian systems. What can we say about other 
nonlinear systems of differential equations? If we can solve such a system numeri- 
cally, and it appears to have chaotic solutions, can we characterize and quantify the 
chaos? In this final section, we will introduce the ideas of the maximum Lyapunov 
exponent and the Lyapunov spectrum, which we can use for this purpose. 


15.5.1 Lyapunov Exponents of Systems of Ordinary Differential 
Equations 

In a chaotic system, we have seen that neighbouring trajectories diverge. In
fact this divergence is usually exponentially fast. If we can quantify this rate of
divergence, we can quantify the chaotic solutions. Consider the system

\[ \frac{d\mathbf{x}}{dt} = \mathbf{f}(\mathbf{x}), \qquad (15.25) \]

where $\mathbf{f} : \mathbb{R}^n \to \mathbb{R}^n$. Let $\mathbf{x}(t)$ be a reference trajectory, and consider a
neighbouring trajectory, $\mathbf{y}(t)$, that has $\mathbf{y}(0) = \mathbf{x}(0) + \Delta\mathbf{x}(0)$. As $t \to \infty$, and the trajectories
diverge, we expect that $\Delta\mathbf{x}(t) = \mathbf{y}(t) - \mathbf{x}(t) \sim \Delta\mathbf{x}(0) e^{\lambda t}$, as shown in Figure 15.33.
If $\lambda < 0$, the trajectories actually converge, whilst if $\lambda > 0$ the trajectories diverge,
and $\lambda$ gives a measure of the rate of divergence.



Fig. 15.33. Neighbouring trajectories diverge exponentially fast. 


Formally, we define the maximum Lyapunov exponent with respect to a
reference trajectory of a flow as

\[ \lambda_{\max} = \lim_{t \to \infty,\ \|\Delta\mathbf{x}(0)\| \to 0} \frac{1}{t} \log \frac{\|\Delta\mathbf{x}(t)\|}{\|\Delta\mathbf{x}(0)\|}, \qquad (15.26) \]

where $\|\mathbf{x}\| = \sqrt{\sum_{i=1}^{n} x_i^2}$ is the Euclidean norm (see Section A1.2). This gives us
the basic numerical recipe for computing $\lambda_{\max}$ from two neighbouring solutions.
Recall that the idea which is central to this definition is linearization about the
trajectory $\mathbf{x}(t)$, and consequently we need to ensure that we can indeed linearize by
considering neighbouring trajectories ($\Delta\mathbf{x}(0) \to 0$). In this limit, $\Delta\mathbf{x}(t)$ is governed
by the linearized equation

\[ \frac{d\, \Delta\mathbf{x}}{dt} = D\mathbf{f}(\mathbf{x}(t))\, \Delta\mathbf{x}. \qquad (15.27) \]

Formally $\Delta\mathbf{x}(t) = \phi_t(\Delta\mathbf{x}(0))$, where $\phi_t(\cdot)$ is a linear evolution operator, so that
$\phi_t(\alpha \Delta\mathbf{x}) = \alpha \phi_t(\Delta\mathbf{x})$. If $\lambda_{\max}$ is positive, numerical integration of (15.27) will lead
to exponentially growing solutions. This can be avoided in a nonlinear system by
renormalizing. The idea is to integrate forward in time until the two trajectories
become a given distance apart, and then scale $\Delta\mathbf{x}$, so that the calculation can
continue with a trajectory that is within a small distance of the reference orbit, as
shown in Figure 15.34. One way of doing this is to renormalize after regular time
intervals $\tau$. The maximum Lyapunov exponent is then given by the average of the
exponent calculated between each renormalization,

\[ \lambda_{\max} = \lim_{n \to \infty} \frac{1}{(n + 1)\tau} \sum_{j=0}^{n} \log \frac{\|\Delta\mathbf{x}_j(\tau)\|}{\|\Delta\mathbf{x}_j(0)\|}, \qquad (15.28) \]

provided that we define our renormalization to be

\[ \Delta\mathbf{x}_n(0) = \delta\, \frac{\Delta\mathbf{x}_{n-1}(\tau)}{\|\Delta\mathbf{x}_{n-1}(\tau)\|} \quad \text{for } n = 1, 2, \ldots, \]

where $\delta \ll 1$ and we note that $\|\Delta\mathbf{x}_n(0)\| = \delta$.


Example: The forced Duffing equation 

Consider the forced Duffing equation, (15.3), with the values of $\epsilon\delta$ and $\epsilon\gamma$ used
for the solution shown in Figure 15.2. We can now construct the maximum Lyapunov
exponent using the MATLAB function

function avls = lyapunov(n,tau,del0,y,eqn)
% avls(i) is the running average of the exponent after i renormalizations
t=0; ls=zeros(1,n); avls=ls;
del=del0*[1 1]/sqrt(2);                      % initial displacement of length del0

for i=1:n
    tspan = [t t+tau];
    [tout1 yout1] = ode45(eqn,tspan,y);      % reference trajectory
    [tout2 yout2] = ode45(eqn,tspan,y+del);  % neighbouring trajectory
    delxe = [yout1(end,:)-yout2(end,:)];
    nd = norm(delxe);
    ls(i) = log(nd/del0); avls(i) = sum(ls)/i/tau;
    del = del0*delxe/nd;                     % renormalize to length del0
    y = yout1(end,:); t = t+tau;
end



The arguments of the function lyapunov are n, the number of separate evaluations
of the maximum Lyapunov exponent, tau = $\tau$, del0 = $\delta$, y, the vector of initial
data, and eqn, a handle containing the name of the equation to be integrated. The
built-in function norm(delxe) calculates the Euclidean norm of the vector delxe.
The command

plot(lyapunov(1000,1,0.01,[0 0],@duffing))

produces Figure 15.35, which shows how the average maximum Lyapunov exponent
converges to a value of about 0.1. Since this is positive, neighbouring trajectories
diverge exponentially fast - an indication that the solution has sensitive dependence
upon initial conditions, and hence is chaotic.

One difficulty associated with calculating the maximum Lyapunov exponent in
this way is that the choice of the direction of the initial displacement, $\Delta\mathbf{x}(0)$, can
have an effect. We will see how to overcome this problem in the next section.
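
The renormalization recipe (15.28) can also be sketched in a few lines of Python. This is an illustration added here, not a translation of the MATLAB function: it uses a fixed-step Runge-Kutta integrator instead of ode45, it assumes an autonomous right-hand side, and it is applied to the linear saddle $\dot{x} = y$, $\dot{y} = x$, whose largest exponent is known to be 1, so that the output can be checked.

```python
import math


def rk4_step(f, y, h):
    """One classical fourth-order Runge-Kutta step for dy/dt = f(y)."""
    k1 = f(y)
    k2 = f([yi + 0.5 * h * ki for yi, ki in zip(y, k1)])
    k3 = f([yi + 0.5 * h * ki for yi, ki in zip(y, k2)])
    k4 = f([yi + h * ki for yi, ki in zip(y, k3)])
    return [yi + h * (a + 2 * b + 2 * c + d) / 6.0
            for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]


def integrate(f, y, tau, steps=100):
    """Advance the solution through a time tau using `steps` RK4 steps."""
    h = tau / steps
    for _ in range(steps):
        y = rk4_step(f, y, h)
    return y


def max_lyapunov(f, y, direction, n, tau, delta):
    """Average of log(growth)/tau over n renormalization intervals, as in (15.28)."""
    norm0 = math.sqrt(sum(d * d for d in direction))
    dx = [delta * d / norm0 for d in direction]
    log_sum = 0.0
    for _ in range(n):
        y_ref = integrate(f, y, tau)
        y_nbr = integrate(f, [yi + di for yi, di in zip(y, dx)], tau)
        sep = [b - a for a, b in zip(y_ref, y_nbr)]
        nd = math.sqrt(sum(s * s for s in sep))
        log_sum += math.log(nd / delta)
        dx = [delta * s / nd for s in sep]   # rescale the separation to length delta
        y = y_ref
    return log_sum / (n * tau)


def saddle(y):
    """Linear saddle with eigenvalues +1 and -1."""
    return [y[1], y[0]]


estimate = max_lyapunov(saddle, [0.0, 0.0], [1.0, 0.0], n=200, tau=1.0, delta=1e-3)
print(estimate)
```

The initial displacement here is deliberately not aligned with the unstable direction, so the first few intervals underestimate the growth rate; the average approaches 1 only as the number of intervals increases, which illustrates the dependence on the initial direction noted above.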


15.5.2 The Lyapunov Spectrum 

Although we now have a way of characterizing the rate of divergence of
neighbouring trajectories, this is rather a blunt tool, and can depend upon the direction of
the initial displacement from the reference trajectory. For an $n$-dimensional system
of differential equations, we can overcome these problems by defining $n$ quantities
that characterize the growth of line†, area, volume and hypervolume elements.

Consider the system (15.25). As we saw above, the time evolution of small
variations $\Delta\mathbf{x}(t)$ about the reference trajectory, $\mathbf{x}(t)$, is governed by

\[ \frac{d\, \Delta\mathbf{x}}{dt} = D\mathbf{f}(\mathbf{x}(t))\, \Delta\mathbf{x}. \]

We can write the solution of this system as

\[ \Delta\mathbf{x}(t) = M(t)\, \Delta\mathbf{x}(0), \]

where the evolution operator $M(t)$ is the fundamental matrix. We can construct
$M$ numerically. The first column of $M$ is $\Delta\mathbf{x}^{(1)}(t)$, the solution subject to the initial
condition $\Delta\mathbf{x}^{(1)}(0) = (1, 0, \ldots, 0)^T$. The $j$th column is $\Delta\mathbf{x}^{(j)}$ subject to the initial
condition $\Delta\mathbf{x}^{(j)}(0) = (0, \ldots, 1, 0, \ldots, 0)^T$ where the one is in the $j$th position. By
using this construction we find that

\[ \Delta\mathbf{x}^{(j)}(t) = M(t)\, \Delta\mathbf{x}^{(j)}(0), \]

with

\[ M(t) = \left( \Delta\mathbf{x}^{(1)}(t), \Delta\mathbf{x}^{(2)}(t), \ldots, \Delta\mathbf{x}^{(n)}(t) \right). \]

The fundamental matrix $M(t)$ has $n$ eigenvalues $\{m_i(t)\}$ and the Lyapunov
exponents are defined as

\[ \lambda_i = \lim_{t \to \infty} \frac{1}{t} \log |m_i(t)| \quad \text{for } i = 1, 2, \ldots, n. \qquad (15.29) \]

We say that the set of $n$ Lyapunov exponents is the Lyapunov spectrum. The
exponents can be ordered as $\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_n$, and $\lambda_1 = \lambda_{\max}$, the maximum
Lyapunov exponent. As before, if one of the exponents is positive, this is a sign of
sensitive dependence upon initial conditions, and we need to use a renormalization
scheme to calculate the Lyapunov spectrum.

† This is just the maximum Lyapunov exponent.

Fig. 15.35. The maximum Lyapunov exponent of the forced Duffing equation, estimated
over 1000 iterations.


Example: The cubic crosscatalator 

We can calculate the Lyapunov spectrum of the cubic crosscatalator equations,
(15.10), using the MATLAB function

function avlambda = lyapunovspectrum(n,tau,del0,y,eqn)
t=0; avlambda = zeros(3,n); lambda = avlambda;
ex = [1 0 0]; ey = [0 1 0]; ez = [0 0 1];    % coordinate directions

for i=1:n
    tspan = [t t+tau];
    [tout yout] = ode45(eqn,tspan,y);            % reference trajectory
    [toutx youtx] = ode45(eqn,tspan,y+del0*ex);  % displaced trajectories
    [touty youty] = ode45(eqn,tspan,y+del0*ey);
    [toutz youtz] = ode45(eqn,tspan,y+del0*ez);

    delx = [youtx(end,:)-yout(end,:)]/del0;
    dely = [youty(end,:)-yout(end,:)]/del0;
    delz = [youtz(end,:)-yout(end,:)]/del0;

    m = eig([delx; dely; delz]);                 % eigenvalues of the approximate M
    lambda(:,i) = log(abs(m));
    avlambda(:,i) = sum(lambda,2)/i/tau;
    y = yout(end,:); t = t+tau;
end



The built-in function eig(A) calculates the eigenvalues of the matrix A. Figure 15.36 
shows the estimates of the three Lyapunov exponents converging over 1000 itera- 
tions. One of these is positive, which indicates that the solution depends sensitively 
upon initial conditions. Since the other two elements of the Lyapunov spectrum 
are negative, this indicates that neighbouring solutions diverge in one direction and 
approach each other in the two perpendicular directions. 

We should note that there are some difficulties associated with calculating the 
Lyapunov exponents numerically. There may be different sets of exponents in dif- 
ferent regions of phase space. This means that the Lyapunov spectrum may depend 
upon the starting condition. It is also notoriously hard to obtain convergence for 
Lyapunov exponents, especially where there are regions characterized by rotation, 
such as in the neighbourhood of centres. 

There are several further points that we can make about the Lyapunov spectrum.

Fig. 15.36. The Lyapunov spectrum of the cubic crosscatalator equations.

- We can also define the $N$-dimensional Lyapunov exponent,

  \[ \lambda^N = \sum_{j=1}^{N} \lambda_j. \]

  This allows us to consider the growth of various elements, so that line elements
  grow like $e^{\lambda^1 t}$, area elements grow like $e^{\lambda^2 t}$, and so on. For the cubic
  crosscatalator, we can see that $\lambda^1$ and $\lambda^2$ are positive, so that line and area elements
  expand, but that $\lambda^3$ is negative, so that volume elements contract.

- The Lyapunov exponents of an equilibrium point are just the real parts of its
  eigenvalues. To see this, note that, in the neighbourhood of an equilibrium point,
  $\mathbf{x} = \mathbf{x}^*$,

  \[ \frac{d\, \Delta\mathbf{x}}{dt} = D\mathbf{f}(\mathbf{x}^*)\, \Delta\mathbf{x}, \]

  with $D\mathbf{f}(\mathbf{x}^*)$ a constant matrix. This means that the fundamental matrix is
  $M = \exp(D\mathbf{f}(\mathbf{x}^*) t)$ (see Section 14.3.2). If $\mu_i$ are the eigenvalues of $D\mathbf{f}(\mathbf{x}^*)$,
  then the eigenvalues of $M$ are $e^{\mu_i t}$. On substituting these into the definition
  (15.29), we obtain $\lambda_i = \mathrm{Re}(\mu_i)$.

- Every point in the basin of attraction of an attractor has the same Lyapunov
  spectrum.

- Lyapunov exponents characterize average rates of expansion ($\lambda_i > 0$) and
  contraction ($\lambda_i < 0$) in phase space. For conservative systems, $\sum_{i=1}^{n} \lambda_i = 0$, since
  the determinant of the Jacobian is unity, which means that the product of its
  eigenvalues is unity, and hence the sum of the Lyapunov exponents is zero. A
  dissipative system, for which volume elements contract, has $\sum_{i=1}^{n} \lambda_i < 0$.

Finally, we note that if a dynamical system is chaotic then at least one Lyapunov
exponent is positive.
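
The statement that the Lyapunov exponents of an equilibrium point are the real parts of the eigenvalues of the Jacobian can be illustrated with a short Python sketch (added here; the two matrices are assumed examples, not taken from the text). It builds the fundamental matrix $M(t)$ column by column, as described above, and then applies (15.29) at a finite time, which is exact for a constant Jacobian up to integration error.

```python
import cmath
import math


def rk4_linear(j, v, t_end, steps):
    """Integrate dv/dt = J v for a constant 2x2 matrix J with classical RK4."""
    def rhs(w):
        return [j[0][0] * w[0] + j[0][1] * w[1],
                j[1][0] * w[0] + j[1][1] * w[1]]
    h = t_end / steps
    for _ in range(steps):
        k1 = rhs(v)
        k2 = rhs([v[i] + 0.5 * h * k1[i] for i in range(2)])
        k3 = rhs([v[i] + 0.5 * h * k2[i] for i in range(2)])
        k4 = rhs([v[i] + h * k3[i] for i in range(2)])
        v = [v[i] + h * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i]) / 6.0
             for i in range(2)]
    return v


def lyapunov_exponents(j, t_end=3.0, steps=3000):
    """Exponents (1/t) log|m_i(t)| from the eigenvalues m_i of M(t)."""
    c1 = rk4_linear(j, [1.0, 0.0], t_end, steps)   # first column of M
    c2 = rk4_linear(j, [0.0, 1.0], t_end, steps)   # second column of M
    tr = c1[0] + c2[1]
    det = c1[0] * c2[1] - c2[0] * c1[1]
    disc = cmath.sqrt(tr * tr - 4.0 * det)
    big = (tr + disc) / 2 if abs(tr + disc) >= abs(tr - disc) else (tr - disc) / 2
    small = det / big          # avoids cancellation for the smaller eigenvalue
    return sorted(math.log(abs(m)) / t_end for m in (big, small))


saddle_exponents = lyapunov_exponents([[0.0, 1.0], [2.0, -1.0]])    # eigenvalues 1, -2
spiral_exponents = lyapunov_exponents([[-1.0, -2.0], [2.0, -1.0]])  # eigenvalues -1 +/- 2i
print(saddle_exponents, spiral_exponents)
```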


Exercises 


15.1 Write down the equations that govern the concentrations of the chemicals 
involved in the cubic crosscatalator scheme, (15.4) to (15.9). After defining 
suitable dimensionless variables, derive (15.10), noting carefully the con- 
ditions on the initial concentration of P and the rate at which P decays 
under which the equations are a good approximation. 

15.2 Determine the period 2 points of the map 

f : x i— > 4x|l. 


15.3 Consider the map

\[ H(x) = \begin{cases} 3x, & \text{for } x \in [0, \tfrac{1}{2}], \\ -2 + 3x, & \text{for } x \in (\tfrac{1}{2}, 1]. \end{cases} \]

Prove that the only points that remain in $[0, 1]$ have the form

\[ x = \sum_{n=1}^{\infty} \frac{a_n}{3^n} \quad \text{with } a_n \in \{0, 2\}. \]

If $a(x) = a_1 a_2 a_3 \ldots$, with $a_n \in \{0, 2\}$, show that

\[ a(H(x)) = \sigma(a(x)), \]

where $\sigma a = b$ if $b_n = a_{n+1}$. For what value of $x$ is $a(x) = 002002002002\ldots$?
Show that this point is periodic of period 3 under $H$.

15.4 In order to determine the $n$th roots of $a$, we can try to determine the zeros
of $f(x, a) = x^n - a$ using the Newton-Raphson iteration. This is given by

\[ x_{i+1} = x_i - \frac{f(x_i)}{f'(x_i)}. \]

(a) Write down the map for the iterates in the determination of the
roots, and show for $n = 2$ that

\[ x_{i+1} = \frac{1}{2} \left( x_i + \frac{a}{x_i} \right). \]

(b) Show that, for general $n$, the fixed points of the map are $a^{1/n}$, and
determine the stability of these points by considering $x_i + \epsilon$ for $\epsilon \ll 1$.
Do you think that this is a good way of determining the $n$th roots
of $a$?

15.5 Determine the fixed point of the map

\[ y_{n+1} = \frac{1}{1 + y_n} \quad \text{for } y_n > 0, \]

and show that with $y_n = x_n / x_{n+1}$, $x_n$ are the Fibonacci numbers.

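15.6 Consider two maps $\mathbf{f}$ and $\mathbf{g}$ such that $\mathbf{f}(\mathbf{x}) : \mathbf{x} \mapsto A\mathbf{x}$ and $\mathbf{g}(\mathbf{x}) : \mathbf{x} \mapsto B\mathbf{x}$,
where $\mathbf{x} = (x, y)^T$, and $A$, $B$ are real 2 x 2 matrices.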

(a) Determine a condition for f(x) to have a fixed point other than 
the trivial one. Comment on the fixed points of the composition 
of f(g(x)) and g(f(x)), stating a condition for which these are the 
same set of points. 

(b) If $\det(A) = \det(B) = 1$, and hence the corresponding maps are area-
preserving, comment on the properties of the composition f(g(x)). 
Discuss the different options for f and g given that the maps are 
area-preserving. 

(c) Consider 



and determine the unstable manifolds, where applicable, of the maps 
f, g and f(g). 

15.7 Express the system

\[ \dot{x} = x^3 + x y^2 - x - 2y, \qquad \dot{y} = y x^2 + y^3 - y + 2x \]

in terms of the polar coordinates $(r, \theta)$ and hence calculate the Poincaré
return map $P : \mathbb{R} \to \mathbb{R}$ as the map of successive intersections of the orbit
$\mathbf{x}(t)$ with the positive $y$-axis. Show that orbits starting on the $y$-axis outside
the circle $r = \sqrt{e^{2\pi} / (e^{2\pi} - 1)}$ never return to the $y$-axis.

15.8 After defining a suitable plane E, use MATLAB to calculate a Poincare 
return map for (i) the forced Duffing equation and (ii) the cubic crosscata- 
lator equations. 

15.9 The equation of motion of a forced simple pendulum is

\[ \frac{d^2\theta}{dt^2} + \sin\theta = \epsilon(\alpha + \gamma \cos \omega t), \]

where $\alpha$, $\gamma$ and $\epsilon$ are positive constants.

Show that if $\epsilon = 0$ then there is a pair of heteroclinic orbits connecting
saddle points at $(\pm\pi, 0)$ in the $(\theta, \phi)$-plane, where $\phi = d\theta/dt$. Deduce that
one of the orbits is given by

\[ \theta_0(t) = 2 \tan^{-1}(\sinh t), \qquad \phi_0(t) = 2\, \mathrm{sech}(t). \]

Show that the Mel’nikov function for the perturbation problem of
intersection near $(0, 2)$ of the unstable manifold from $(-\pi, 0)$ and the stable
manifold to $(\pi, 0)$ can be expressed as

\[ M(t_0) = \int_{-\infty}^{\infty} \phi_0(t - t_0) (\alpha + \gamma \cos \omega t)\, dt, \]

and deduce that

\[ M(t_0) = 2\pi \left\{ \alpha + \gamma\, \mathrm{sech}\left( \tfrac{1}{2} \pi \omega \right) \cos(\omega t_0) \right\}. \]

Hence show that there is chaos for small $\epsilon$ if $\gamma > \alpha \cosh(\tfrac{1}{2} \pi \omega)$.

15.10 Arnol’d’s cat map maps the torus $T^2 = \mathbb{R}^2 / \mathbb{Z}^2$ to itself, and is given by
$\mathbf{x}_{n+1} = \mathbf{f}(\mathbf{x}_n)$, where $\mathbf{x}_n = (x_n, y_n)$ and

\[ \mathbf{f}(x, y) = (x + y \text{ modulo } 1,\; x + 2y \text{ modulo } 1). \]

Show that this map is area-preserving. Find its Lyapunov exponents.

15.11 Calculate the Lyapunov spectrum of the Lorenz equations.

15.12 Project The two-dimensional motion of a particle in a flow with stream
function $\psi = \psi(x, y, t)$ is governed by the equations

\[ \dot{x} = \frac{\partial \psi}{\partial y}, \qquad \dot{y} = -\frac{\partial \psi}{\partial x}. \qquad (E15.1) \]

Consider the motion of a particle under the influence of an impulsive Stokes
flow, for which momentum is negligibly small, and $\psi = \psi^*(x, y)\, \delta(t - t_0)$.
We can integrate (E15.1) to show that

\[ x_1 - x_0 = \left. \frac{\partial \psi^*}{\partial y} \right|_{(x_0, y_0)}, \qquad
   y_1 - y_0 = -\left. \frac{\partial \psi^*}{\partial x} \right|_{(x_0, y_0)}, \qquad (E15.2) \]

where $(x_0, y_0)$ is the position of the particle before the impulse and $(x_1, y_1)$
its position after the impulse.


(a) Show that the map defined in (E15.2) is not area-preserving, but
that the map

\[ x_1 - x_0 = \left. \frac{\partial \psi^*}{\partial y} \right|_{(x_0, y_1)}, \qquad
   y_1 - y_0 = -\left. \frac{\partial \psi^*}{\partial x} \right|_{(x_0, y_1)}, \qquad (E15.3) \]

is area-preserving.

(b) Write a MATLAB script that iterates points forward under the
influence of the pulsed flow

\[ \psi(x, y, t) = \sum_{n=0}^{\infty} \psi^*(x, y)\, \delta(t - n), \]

where the stream function is that associated with a point force at
$(x, y) = (d, h)$ above a solid plane at $y = 0$. This is known as a
Stokeslet, and its stream function is given by

i>*d,h(x,y) = 


a(x 


d) 



{x — d) 2 + (y + h) 2 1 2 hy 

(. x — d) 2 + (y — h ) 2 J (x — d) 2 + (y + h) 2 


Here $\alpha$ is the strength parameter (see Otto, Yannacopoulos and
Blake, 2001, for more details). Use the area-preserving map (E15.3).
This will give a stroboscopic plot of the trajectory of the point. 

(c) Consider the effect of alternating Stokeslets at (0, |) and (0, |) with 
the same strengths. Show that this leads to chaotic dynamics. This 
is similar to Aref’s blinking vortex which leads to chaotic advection 
(Aref, 1984). 

(d) Now consider the flow associated with Stokeslets that are not on a 
vertical line, for instance (— g, |) and (g, g) or (— g, |) and (g, |). 

(e) By constructing the Jacobian associated with the flow, determine 
the nature of any fixed or periodic points, and determine the Lya- 
punov exponents associated with the flow. 

(f) This model can be extended to other flows (see Ottino, 1989). In- 
vestigate the combination of other fundamental solutions of Stokes 
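flow.

Linear Algebra


A1.1 Vector Spaces Over the Real Numbers

Let $V$ be a set of objects called vectors, of the form $V = \{\ldots, \mathbf{x}, \mathbf{y}, \mathbf{z}, \ldots\}$, and
let $\mathbb{R}$ denote the real numbers, or scalars. The set $V$ forms a vector space over
$\mathbb{R}$ if, for all $\mathbf{x}, \mathbf{y}, \mathbf{z} \in V$ and $\alpha, \beta \in \mathbb{R}$,

(i) $\mathbf{x} + \mathbf{y} = \mathbf{y} + \mathbf{x}$,

(ii) $(\mathbf{x} + \mathbf{y}) + \mathbf{z} = \mathbf{x} + (\mathbf{y} + \mathbf{z})$,

(iii) $\mathbf{x} + \mathbf{0} = \mathbf{x}$,

(iv) $\mathbf{x} + (-\mathbf{x}) = \mathbf{0}$,

(v) $\alpha(\mathbf{x} + \mathbf{y}) = \alpha\mathbf{x} + \alpha\mathbf{y}$,

(vi) $(\alpha + \beta)\mathbf{x} = \alpha\mathbf{x} + \beta\mathbf{x}$,

(vii) $(\alpha\beta)\mathbf{x} = \alpha(\beta\mathbf{x})$,

(viii) $1\mathbf{x} = \mathbf{x}$.

These conditions are the familiar laws of commutativity, associativity and
distributivity for the vectors and scalars, together with the existence of inverses and
identities for the scalars and vectors.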

Examples

(i) $V = \mathbb{R}^n$, so that

\[ \mathbf{x} = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}, \qquad
   \mathbf{y} = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix} \]

are $n$-dimensional vectors that can be written in terms of their coordinates.
If we then define vector addition and scalar multiplication by

\[ \mathbf{x} + \mathbf{y} = \begin{pmatrix} x_1 + y_1 \\ x_2 + y_2 \\ \vdots \\ x_n + y_n \end{pmatrix}, \qquad
   \alpha\mathbf{x} = \begin{pmatrix} \alpha x_1 \\ \alpha x_2 \\ \vdots \\ \alpha x_n \end{pmatrix}, \]

it is straightforward to verify that $\mathbb{R}^n$ is a vector space over $\mathbb{R}$.

(ii) $V = P_n = \{ a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0 \mid a_i \in \mathbb{R},\ x \in [\alpha, \beta] \}$, so
that $V$ is the set of polynomials of degree $n$ with domain $x \in [\alpha, \beta]$. A
typical member of $V$ would be an object of the form $6x^3 - 2x + 1$. If we
define vector addition and scalar multiplication by

\[ (f + g)(x) = f(x) + g(x), \qquad (\alpha f)(x) = \alpha f(x), \]

for $x \in [\alpha, \beta]$, and the zero function by $0(x) = 0$, it is again easy to verify
that $V$ is a vector space over $\mathbb{R}$.

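A subset $B = \{\mathbf{b}_1, \mathbf{b}_2, \ldots, \mathbf{b}_n\}$ of a vector space $V$ is said to be linearly
independent if $\alpha_1 \mathbf{b}_1 + \alpha_2 \mathbf{b}_2 + \cdots + \alpha_n \mathbf{b}_n = \mathbf{0}$ implies that $\alpha_1 = \alpha_2 = \cdots = \alpha_n = 0$.
If $\alpha_1 \mathbf{x}_1 + \alpha_2 \mathbf{x}_2 + \cdots + \alpha_n \mathbf{x}_n = \mathbf{0}$, and the $\alpha_i$ are not all zero, we say that the set
of vectors $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n$ is linearly dependent.

The subset $B$ forms a basis for $V$ if, for every $\mathbf{x} \in V$, we can write $\mathbf{x}$ as a linear
combination of the elements of $B$, so that $\mathbf{x} = \alpha_1 \mathbf{b}_1 + \alpha_2 \mathbf{b}_2 + \cdots + \alpha_n \mathbf{b}_n$, for
some $\alpha_i \in \mathbb{R}$. The set of all linear combinations of $\mathbf{b}_1, \ldots, \mathbf{b}_n$ is called the span of
these vectors. If $\mathrm{span}(\mathbf{b}_1, \ldots, \mathbf{b}_n) = V$, then $\mathbf{b}_1, \ldots, \mathbf{b}_n$ form a basis for $V$.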

A vector space V is finite dimensional if it has a basis with a finite number of 
elements. If it is not finite dimensional, and it has a basis with an infinite number 
of elements, it is said to be infinite dimensional. 


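Examples

(i) Consider $V = \mathbb{R}^n$. The subset

\[ \mathbf{b}_1 = \begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix}, \quad
   \mathbf{b}_2 = \begin{pmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{pmatrix}, \quad \ldots, \quad
   \mathbf{b}_n = \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{pmatrix} \]

forms a basis, since

\[ \mathbf{x} = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} = x_1 \mathbf{b}_1 + x_2 \mathbf{b}_2 + \cdots + x_n \mathbf{b}_n. \]

There are other bases for $\mathbb{R}^n$, but this is the simplest. All other bases also
have $n$ elements.

(ii) Consider the vector space

\[ V = \left\{ f(x) = \sum_{n=0}^{\infty} a_n \cos nx \;\middle|\; a_n \in \mathbb{R},\ x \in [-\pi, \pi] \right\}, \]

which consists of convergent Fourier series defined on $-\pi \leq x \leq \pi$. A basis
for $V$ is $B = \{1, \cos x, \cos 2x, \ldots\}$, which contains an infinite number of
elements. This shows that $V$ is infinite dimensional.

A1.2 Inner Product Spaces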

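The inner or dot product of two elements, $\mathbf{x} = (x_1, x_2, \ldots, x_n)$ and $\mathbf{y} = (y_1, y_2, \ldots, y_n)$,
of $\mathbb{R}^n$ is defined to be

\[ \langle \mathbf{x}, \mathbf{y} \rangle = x_1 y_1 + x_2 y_2 + \cdots + x_n y_n. \]

For a general vector space, $V$ over $\mathbb{R}$, the inner product is a mapping, $\langle \cdot, \cdot \rangle :
V \times V \to \mathbb{R}$, with the three properties

(i) $\langle \mathbf{x}, \mathbf{x} \rangle \geq 0$,

(ii) $\langle \mathbf{x}, \mathbf{x} \rangle = 0$ if and only if $\mathbf{x} = \mathbf{0}$,

(iii) $\langle \alpha\mathbf{x} + \beta\mathbf{y}, \mathbf{z} \rangle = \alpha \langle \mathbf{x}, \mathbf{z} \rangle + \beta \langle \mathbf{y}, \mathbf{z} \rangle$,

for $\mathbf{x}, \mathbf{y}, \mathbf{z} \in V$, $\alpha, \beta \in \mathbb{R}$.

A vector space with an inner product defined on it is called an inner product
space. An important example is the space of all real-valued functions, $C(I)$, defined
on an interval $I = [a, b]$. It is straightforward to confirm, using the properties of
the Riemann integral, that

\[ \langle f, g \rangle = \int_a^b f(x) g(x)\, dx \]

is an inner product. For example, if $I = [-1, 1]$, $f(x) = x$ and $g(x) = x^3$, then
$\langle f, g \rangle = \int_{-1}^{1} x^4\, dx = \frac{2}{5}$.

Two nonzero vectors, $\mathbf{x}$ and $\mathbf{y}$, in an inner product space are said to be
orthogonal if $\langle \mathbf{x}, \mathbf{y} \rangle = 0$. A set of nonzero vectors in an inner product space, $\{\mathbf{x}_i\}$ for
$i \geq 1$, whose members are mutually orthogonal is necessarily linearly independent.
To see this, suppose that $\alpha_1 \mathbf{x}_1 + \alpha_2 \mathbf{x}_2 + \cdots + \alpha_n \mathbf{x}_n = \mathbf{0}$ for $\alpha_1, \alpha_2, \ldots, \alpha_n \in \mathbb{R}$.
Taking the inner product with $\mathbf{x}_j$ gives
$\alpha_1 \langle \mathbf{x}_1, \mathbf{x}_j \rangle + \alpha_2 \langle \mathbf{x}_2, \mathbf{x}_j \rangle + \cdots + \alpha_n \langle \mathbf{x}_n, \mathbf{x}_j \rangle = \alpha_j \langle \mathbf{x}_j, \mathbf{x}_j \rangle = 0$, and
hence, by property (ii) above, $\alpha_j = 0$ for each $j \geq 1$.

A norm, $\|\mathbf{x}\|$, of a vector $\mathbf{x}$ must have the four properties

(i) $\|\mathbf{x}\|$ is a non-negative real number,

(ii) $\|\mathbf{x}\| = 0$ if and only if $\mathbf{x} = \mathbf{0}$,

(iii) $\|k\mathbf{x}\| = |k| \, \|\mathbf{x}\|$ for all real $k$,

(iv) $\|\mathbf{x} + \mathbf{y}\| \leq \|\mathbf{x}\| + \|\mathbf{y}\|$, the triangle inequality.

The Euclidean norm of a vector is defined to be $\|\mathbf{x}\| = \sqrt{\langle \mathbf{x}, \mathbf{x} \rangle}$, and gives the
size or length of $\mathbf{x}$. This is familiar in $\mathbb{R}^3$, with $\langle \mathbf{x}, \mathbf{y} \rangle = x_1 y_1 + x_2 y_2 + x_3 y_3$, so that
$\|\mathbf{x}\| = \sqrt{x_1^2 + x_2^2 + x_3^2}$. In $C(I)$, there are different types of norm. Using the inner
product discussed above, we can define a norm

\[ \|f\| = \sqrt{\int_a^b f^2(x)\, dx}. \]

A useful relationship between the inner product and this norm is the Cauchy-Schwarz
inequality,

\[ |\langle \mathbf{x}, \mathbf{y} \rangle| \leq \|\mathbf{x}\| \, \|\mathbf{y}\|. \]

We can also define the sup norm through

\[ \|f\| = \sup \{ |f(x)| \mid x \in I \}. \]

The function space $C(I)$ is complete under the sup norm.† This can be useful
when proving rigorous results about differential equations.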
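
The integral inner product on $C(I)$ and the norm it induces are easy to experiment with numerically. The Python sketch below (an illustration added here; the quadrature routine and its name are assumptions, not part of the text) approximates $\langle f, g \rangle$ with the composite Simpson rule, reproduces the worked example with $f(x) = x$ and $g(x) = x^3$ on $[-1, 1]$, and checks the Cauchy-Schwarz inequality for this pair.

```python
import math


def inner(f, g, a, b, n=2000):
    """Composite Simpson approximation to the integral of f(x) g(x) over [a, b] (n even)."""
    h = (b - a) / n
    total = f(a) * g(a) + f(b) * g(b)
    for k in range(1, n):
        x = a + k * h
        total += (4 if k % 2 else 2) * f(x) * g(x)
    return total * h / 3.0


def norm(f, a, b):
    """Norm induced by the inner product: sqrt(<f, f>)."""
    return math.sqrt(inner(f, f, a, b))


def f(x):
    return x


def g(x):
    return x ** 3


ip = inner(f, g, -1.0, 1.0)                       # integral of x^4 over [-1, 1]
bound = norm(f, -1.0, 1.0) * norm(g, -1.0, 1.0)   # ||f|| ||g||
print(ip, bound)
```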

A1.3 Linear Transformations and Matrices 

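A transformation, $T$, from a vector space, $V$, into itself, denoted by $T : V \to V$, is
linear if

(i) $T(\mathbf{x} + \mathbf{y}) = T(\mathbf{x}) + T(\mathbf{y}) \quad \forall\, \mathbf{x}, \mathbf{y} \in V$,

(ii) $T(\lambda\mathbf{x}) = \lambda T(\mathbf{x}) \quad \forall\, \lambda \in \mathbb{R},\ \mathbf{x} \in V$.

It follows immediately from this definition that $T(\lambda\mathbf{x} + \mu\mathbf{y}) = \lambda T(\mathbf{x}) + \mu T(\mathbf{y})$ and
$T(\mathbf{0}) = \mathbf{0}$.


Examples

(i) If $V = \mathbb{R}^3$ and $T : \mathbb{R}^3 \to \mathbb{R}^3$ is defined by

\[ T \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} x_2 \\ x_3 \\ x_1 \end{pmatrix}, \]

then

\[ T(\mathbf{x} + \mathbf{y}) = T \begin{pmatrix} x_1 + y_1 \\ x_2 + y_2 \\ x_3 + y_3 \end{pmatrix}
 = \begin{pmatrix} x_2 + y_2 \\ x_3 + y_3 \\ x_1 + y_1 \end{pmatrix} = T(\mathbf{x}) + T(\mathbf{y}), \]

\[ T(\lambda\mathbf{x}) = T \begin{pmatrix} \lambda x_1 \\ \lambda x_2 \\ \lambda x_3 \end{pmatrix}
 = \begin{pmatrix} \lambda x_2 \\ \lambda x_3 \\ \lambda x_1 \end{pmatrix} = \lambda T(\mathbf{x}), \]

so that $T$ is a linear transformation.

(ii) If $V = P_n$, the vector space of polynomials of degree $n$, and

\[ T(a_n x^n + \cdots + a_1 x + a_0) = n a_n x^{n-1} + \cdots + a_1, \]

$T$ is a linear transformation, and can be identified with the operation of
differentiation.

Linear transformations can be represented by considering their effect on the
basis vectors. If $T(\mathbf{b}_j) = \alpha_{1j} \mathbf{b}_1 + \alpha_{2j} \mathbf{b}_2 + \cdots + \alpha_{nj} \mathbf{b}_n$, and we take a general vector
$\mathbf{x} = \lambda_1 \mathbf{b}_1 + \lambda_2 \mathbf{b}_2 + \cdots + \lambda_n \mathbf{b}_n$, then

\[ T(\mathbf{x}) = \lambda_1 T(\mathbf{b}_1) + \lambda_2 T(\mathbf{b}_2) + \cdots + \lambda_n T(\mathbf{b}_n) \]
\[ = \lambda_1 (\alpha_{11} \mathbf{b}_1 + \alpha_{21} \mathbf{b}_2 + \cdots + \alpha_{n1} \mathbf{b}_n) + \cdots + \lambda_n (\alpha_{1n} \mathbf{b}_1 + \alpha_{2n} \mathbf{b}_2 + \cdots + \alpha_{nn} \mathbf{b}_n), \]

which, using the standard definition of matrix multiplication, we can write as

\[ T \begin{pmatrix} \lambda_1 \\ \lambda_2 \\ \vdots \\ \lambda_n \end{pmatrix}
 = \begin{pmatrix} \alpha_{11} & \alpha_{12} & \cdots & \alpha_{1n} \\ \alpha_{21} & \alpha_{22} & \cdots & \alpha_{2n} \\ \vdots & \vdots & & \vdots \\ \alpha_{n1} & \alpha_{n2} & \cdots & \alpha_{nn} \end{pmatrix}
   \begin{pmatrix} \lambda_1 \\ \lambda_2 \\ \vdots \\ \lambda_n \end{pmatrix}. \]

We say that the matrix

\[ A = \begin{pmatrix} \alpha_{11} & \alpha_{12} & \cdots & \alpha_{1n} \\ \alpha_{21} & \alpha_{22} & \cdots & \alpha_{2n} \\ \vdots & \vdots & & \vdots \\ \alpha_{n1} & \alpha_{n2} & \cdots & \alpha_{nn} \end{pmatrix} \]

is a representation of the transformation. In example (i) above, with the standard
basis of $\mathbb{R}^3$, $T(\mathbf{b}_1) = \mathbf{b}_3$, $T(\mathbf{b}_2) = \mathbf{b}_1$ and $T(\mathbf{b}_3) = \mathbf{b}_2$, so that

\[ A = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{pmatrix}. \]

† See Kreider, Kuller, Ostberg and Perkins (1966) for a discussion of completeness.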

A 1.4 The Eigenvalues and Eigenvectors of a Matrix 

If $A$ is an $n \times n$ matrix and $\mathbf{x}$ is an $n \times 1$ column vector, the eigenvalues of $A$
are defined as those values of $\lambda$ for which the equation $A\mathbf{x} = \lambda\mathbf{x}$ has a nontrivial
solution. For each of these values of $\lambda$, the eigenvectors of $A$ are the corresponding
values of $\mathbf{x}$. This defining equation can be rearranged into the form $(A - \lambda I)\mathbf{x} = \mathbf{0}$,
where $I$ is the $n \times n$ identity matrix, for which the condition for nontrivial solutions
is $\det(A - \lambda I) = 0$. This is known as the characteristic equation associated with
the matrix $A$.


Example

Find the eigenvalues and eigenvectors of the matrix

\[ A = \begin{pmatrix} 4 & -1 \\ 2 & 1 \end{pmatrix}. \]

The eigenvalues satisfy

\[ \det \begin{pmatrix} 4 - \lambda & -1 \\ 2 & 1 - \lambda \end{pmatrix} = 0. \]

This gives $\lambda^2 - 5\lambda + 6 = 0$ so that $\lambda = 2$ or 3. The eigenvectors satisfy

\[ \begin{pmatrix} 4 & -1 \\ 2 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \lambda \begin{pmatrix} x \\ y \end{pmatrix}. \]

For $\lambda = 2$, $2x - y = 0$, so that the eigenvector is $\alpha(1, 2)^T$ for any nonzero real $\alpha$.
Here, the superscript $T$ denotes the transpose of the vector. For $\lambda = 3$, $x - y = 0$
so that the eigenvector is $\beta(1, 1)^T$ for any nonzero real $\beta$. The eigenvectors are
linearly independent and span $\mathbb{R}^2$.
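
For a 2 x 2 matrix the characteristic equation is the quadratic $\lambda^2 - \mathrm{tr}(A)\lambda + \det(A) = 0$, so the example above can be checked in a few lines. The Python sketch below is illustrative only and assumes real eigenvalues; the function names are not from the text.

```python
import math


def eigenvalues_2x2(a):
    """Roots of lambda^2 - tr(A) lambda + det(A) = 0, assuming they are real."""
    tr = a[0][0] + a[1][1]
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    root = math.sqrt(tr * tr - 4.0 * det)
    return sorted([(tr - root) / 2.0, (tr + root) / 2.0])


def matvec(a, v):
    """Product of a 2x2 matrix (list of rows) with a 2-vector."""
    return [a[0][0] * v[0] + a[0][1] * v[1], a[1][0] * v[0] + a[1][1] * v[1]]


A = [[4.0, -1.0], [2.0, 1.0]]
lams = eigenvalues_2x2(A)
print(lams)

# Check A v = lambda v for the eigenvectors quoted in the example.
for lam, v in zip(lams, ([1.0, 2.0], [1.0, 1.0])):
    print(matvec(A, v), [lam * vi for vi in v])
```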


Example

Find the eigenvalues and eigenvectors of the matrix

\[ A = \begin{pmatrix} 0 & 0 & 1 \\ 0 & -1 & 0 \\ 2 & 2 & 1 \end{pmatrix}. \]

The eigenvalues satisfy

\[ \det \begin{pmatrix} -\lambda & 0 & 1 \\ 0 & -1 - \lambda & 0 \\ 2 & 2 & 1 - \lambda \end{pmatrix} = 0, \]

so that $(\lambda - 2)(\lambda + 1)^2 = 0$, and hence $\lambda = 2$ or $-1$. The associated eigenvectors
are $\alpha(1, 0, 2)^T$ corresponding to $\lambda = 2$, and $\beta(1, 0, -1)^T$ corresponding to $\lambda = -1$.
These eigenvectors span a two-dimensional subspace of $\mathbb{R}^3$.


A particularly useful result concerning eigenvalues is that, if an $n \times n$ matrix
$A$ has $n$ distinct eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$, there exists a nonsingular matrix $B$,
whose columns are the eigenvectors of $A$, such that

\[ B^{-1} A B = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n) =
   \begin{pmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{pmatrix}. \]

The only nonzero entries of this matrix are on its diagonal, and are the eigenvalues
of $A$.

Consider the system of differential equations $\dot{\mathbf{x}} = A\mathbf{x}$, where $\mathbf{x} = (x_1, x_2, \ldots, x_n)^T$
and a dot denotes differentiation. If we write $\mathbf{x} = B\mathbf{y}$, the system changes to
$\dot{\mathbf{y}} = B^{-1} A B \mathbf{y}$, which is considerably easier to analyze if $B^{-1} A B$ is diagonal, since
the differential equations are all decoupled.


Example 

Let’s try to find a simplification of the system of differential equations 


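$$\dot{x}_1 = x_3, \quad \dot{x}_2 = x_1, \quad \dot{x}_3 = 2x_1 + x_2.$$

We can write this in matrix form as $\dot{\mathbf{x}} = A\mathbf{x}$ with

$$A = \begin{pmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 2 & 1 & 0 \end{pmatrix}.$$

We need to simplify the structure of the matrix $A$. To do this we firstly find the eigenvalues, $\lambda = -1$, $\frac{1}{2}(1 + \sqrt{5})$ and $\frac{1}{2}(1 - \sqrt{5})$, and the corresponding eigenvectors,

$$\begin{pmatrix} 1 \\ -1 \\ -1 \end{pmatrix}, \quad \begin{pmatrix} 1 \\ -\frac{1}{2}(1 - \sqrt{5}) \\ \frac{1}{2}(1 + \sqrt{5}) \end{pmatrix}, \quad \begin{pmatrix} 1 \\ -\frac{1}{2}(1 + \sqrt{5}) \\ \frac{1}{2}(1 - \sqrt{5}) \end{pmatrix}.$$

These correspond to distinct eigenvalues, and hence are linearly independent, and form a basis for $\mathbb{R}^3$. After choosing our matrix $B$ to be

$$B = \begin{pmatrix} 1 & 1 & 1 \\ -1 & -\frac{1}{2}(1 - \sqrt{5}) & -\frac{1}{2}(1 + \sqrt{5}) \\ -1 & \frac{1}{2}(1 + \sqrt{5}) & \frac{1}{2}(1 - \sqrt{5}) \end{pmatrix},$$

with inverse

$$B^{-1} = \begin{pmatrix} 1 & 1 & -1 \\ \frac{1}{\sqrt{5}} & \frac{1}{2\sqrt{5}}(3 - \sqrt{5}) & -\frac{1}{2\sqrt{5}}(1 - \sqrt{5}) \\ -\frac{1}{\sqrt{5}} & -\frac{1}{2\sqrt{5}}(3 + \sqrt{5}) & \frac{1}{2\sqrt{5}}(1 + \sqrt{5}) \end{pmatrix},$$

we find that

$$B^{-1}AB = \begin{pmatrix} -1 & 0 & 0 \\ 0 & \frac{1}{2}(1 + \sqrt{5}) & 0 \\ 0 & 0 & \frac{1}{2}(1 - \sqrt{5}) \end{pmatrix} = \mathrm{diag}\left(-1, \tfrac{1}{2}(1 + \sqrt{5}), \tfrac{1}{2}(1 - \sqrt{5})\right)$$

is the representation of $A$ with respect to the basis of eigenvectors. The transformed system of differential equations therefore takes the form

$$\dot{y}_1 = -y_1, \quad \dot{y}_2 = \tfrac{1}{2}(1 + \sqrt{5})\,y_2, \quad \dot{y}_3 = \tfrac{1}{2}(1 - \sqrt{5})\,y_3.$$

These have the simple solutions $y_1 = c_1 \exp(-t)$, $y_2 = c_2 \exp\{\frac{1}{2}(1 + \sqrt{5})t\}$, $y_3 = c_3 \exp\{\frac{1}{2}(1 - \sqrt{5})t\}$. Finally, the solution is $x_1 = y_1 + y_2 + y_3$, $x_2 = -y_1 - \frac{1}{2}(1 - \sqrt{5})y_2 - \frac{1}{2}(1 + \sqrt{5})y_3$ and $x_3 = -y_1 + \frac{1}{2}(1 + \sqrt{5})y_2 + \frac{1}{2}(1 - \sqrt{5})y_3$, in terms of the original variables.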
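
As a quick numerical check of this example, the following sketch confirms that each column of $B$ is an eigenvector of $A$ for the stated eigenvalue. The helper names `matvec` and `column` are ours, and plain nested lists are used instead of a linear algebra library.

```python
import math

s5 = math.sqrt(5.0)
phi, psi = (1 + s5) / 2, (1 - s5) / 2   # the two irrational eigenvalues

A = [[0, 0, 1],
     [1, 0, 0],
     [2, 1, 0]]
# Columns of B are the eigenvectors for -1, phi and psi, in that order.
B = [[1, 1, 1],
     [-1, -psi, -phi],
     [-1, phi, psi]]
eigenvalues = [-1.0, phi, psi]

def matvec(M, v):
    """Product of a 3x3 matrix with a 3-vector."""
    return [sum(M[i][j] * v[j] for j in range(3)) for i in range(3)]

def column(M, j):
    """The j-th column of a 3x3 matrix as a list."""
    return [M[i][j] for i in range(3)]

# Largest entry of A v - lambda v over the three eigenpairs.
residual = max(
    abs(Av_i - lam * v_i)
    for j, lam in enumerate(eigenvalues)
    for Av_i, v_i in zip(matvec(A, column(B, j)), column(B, j))
)
print(residual)
```

A residual at the level of rounding error indicates that $AB = B\,\mathrm{diag}(\lambda_1, \lambda_2, \lambda_3)$, which is equivalent to the diagonalization used above.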

Finally, we will often make use of the Cayley-Hamilton theorem.


Theorem A1.1 (Cayley-Hamilton) Every square matrix $A$ satisfies its own characteristic equation, $\det(A - \lambda I) = 0$.

The proof of this theorem is rather involved, and we will not consider it here (see Morris, 1982). The Cayley-Hamilton theorem shows that $A^k$, with $k \geq n$, can be written as a linear combination of $I, A, A^2, \ldots, A^{n-1}$.
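
To illustrate, here is a small sketch that uses the $2 \times 2$ matrix from the first eigenvalue example, whose characteristic equation is $\lambda^2 - 5\lambda + 6 = 0$. It checks that $A^2 - 5A + 6I = 0$ and then that $A^3 = 5A^2 - 6A = 19A - 30I$. The helper `matmul` is ours.

```python
def matmul(X, Y):
    """Product of two square matrices stored as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

A = [[4, -1], [2, 1]]
I = [[1, 0], [0, 1]]
A2 = matmul(A, A)

# Cayley-Hamilton: A^2 - 5A + 6I should be the zero matrix.
ch = [[A2[i][j] - 5 * A[i][j] + 6 * I[i][j] for j in range(2)]
      for i in range(2)]

# Hence A^2 = 5A - 6I, and A^3 = 5A^2 - 6A = 19A - 30I.
A3_direct = matmul(A2, A)
A3_reduced = [[19 * A[i][j] - 30 * I[i][j] for j in range(2)]
              for i in range(2)]
print(ch, A3_direct, A3_reduced)
```

The same reduction can be repeated to write any higher power of this $A$ in terms of $I$ and $A$ alone.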



APPENDIX 2 


Continuity and Differentiability 


Let $f$ be a real-valued function defined on some open interval that contains the point $x = c$. We say that

$f$ is continuous at $x = c$ if and only if $\lim_{x \to c} f(x) = f(c)$.

The function $f$ is continuous on the interval $(a, b)$ if and only if it is continuous for all $x \in (a, b)$. The set of all continuous functions $f : (a, b) \to \mathbb{R}$ ($\mathbb{R}$ denotes the set of real numbers) forms a vector space denoted by $C(a, b)$ or $C^0(a, b)$.

A function $f$ is continuous on $[a, b]$ if it is continuous on $(a, b)$ and

$$\lim_{x \to a^+} f(x) = f(a), \quad \lim_{x \to b^-} f(x) = f(b).$$

In other words, there is no singular behaviour at the ends of the interval. We will use $C[a, b]$ to denote the vector space of continuous functions $f : [a, b] \to \mathbb{R}$.

An alternative, but equivalent, definition of continuity is that $f$ is continuous at a point $x = x_0$ if, for every $\epsilon > 0$, there exists a $\delta(x_0, \epsilon) > 0$ such that $|f(x) - f(x_0)| < \epsilon$ when $|x - x_0| < \delta$. If $\delta$ depends upon $\epsilon$ but not upon $x_0$ in some interval $I$, we say that $f$ is uniformly continuous on $I$. It can be shown that, if a function is continuous on a closed interval, then it is also uniformly continuous there.

If $\lim_{x \to c^+} f(x) \neq \lim_{x \to c^-} f(x)$, $f(x)$ is not continuous at $x = c$. There is an important class of functions of this form, for which there are a finite number of discontinuities, at each of which $f(x)$ jumps by a finite amount. These are known as piecewise continuous functions, and, like the continuous functions, form a vector space, which we denote by $PC(a, b)$.

If the limit of $\{f(x) - f(c)\}/(x - c)$ exists as $x \to c$, we say that $f$ is differentiable at $x = c$, and denote its derivative by

$$f'(c) = \lim_{x \to c} \frac{f(x) - f(c)}{x - c}.$$

The set of all once-differentiable functions on $(a, b)$ forms a vector space which we denote by $C^1(a, b)$. We can make similar definitions of $C^2(a, b), C^3(a, b), \ldots, C^n(a, b), \ldots, C^\infty(a, b)$, in which the higher derivatives are defined in terms of limits of derivatives of one order lower.

Examples of functions that live in these spaces are: 


(i) $f : (0, 2\pi) \to \mathbb{R} : f(x) = \sin x$. Since $f'(x) = \cos x$ and $f''(x) = -\sin x = -f(x)$, this function can be differentiated as many times as we like and the result will be a continuous function. We conclude that $f \in C^\infty(0, 2\pi)$.

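(ii) $f : (-1, 1) \to \mathbb{R} : f(x) = x^2$ for $x \geq 0$, $f(x) = 0$ for $x < 0$. This function can be differentiated once everywhere, with $f'(x) = 2x$ for $x \geq 0$ and $f'(x) = 0$ for $x < 0$, but, since $f''(x) = 0$ for $x < 0$ and $f''(x) = 2$ for $x > 0$, the second derivative does not exist at $x = 0$. We conclude that $f \in C^1(-1, 1)$, but $f \notin C^2(-1, 1)$.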

(iii) $f : \mathbb{R} \to \mathbb{R} : f(x) = \sum_{n} \frac{1}{n!} \sin\left\{(n!)^2 x\right\}$. This function, which is illustrated in Figure A2.1, is continuous on $\mathbb{R}$ but nowhere differentiable, so that $f \in C^0(\mathbb{R})$, but $f \notin C^n(\mathbb{R})$ for any $n > 0$.


Fig. A2.1. The continuous, nowhere differentiable function $f(x) = \sum_{n} \frac{1}{n!} \sin\left\{(n!)^2 x\right\}$.


(iv) $f : [0, 3] \to \mathbb{R}$ :

$$f(x) = \begin{cases} x^2 & \text{for } 0 \leq x \leq 1, \\ \cos x & \text{for } 1 < x \leq 2, \\ e^{-x} & \text{for } 2 < x \leq 3, \end{cases} \tag{A2.1}$$

which is plotted in Figure A2.2. This function is discontinuous at $x = 1$ and $x = 2$, and therefore $f \in PC[0, 3]$.

Fig. A2.2. The piecewise continuous function given as example (iv). A cross indicates the value of the function at a point of discontinuity.

If $f$ is differentiable at $c \in (a, b)$, it is continuous at $c$ (but the converse of this is false, see example (iii) above) so that $C^0(a, b) \supset C^1(a, b) \supset C^2(a, b) \supset \cdots \supset C^n(a, b) \supset \cdots$.

We conclude this section with two useful theorems that you should recall from your first course in real analysis.

Theorem A2.1 If $f : [a, b] \to \mathbb{R}$ and $f \in C[a, b]$, then $\exists K$ such that $|f(x)| < K$ $\forall x \in [a, b]$. In words, a continuous function on a closed, bounded interval is bounded.


Theorem A2.2 (The mean value theorem) If $f : (a, b) \to \mathbb{R}$ and $f \in C^1(a, b)$, then $\exists c \in (a, b)$ such that $f(b) - f(a) = (b - a)f'(c)$.




APPENDIX 3 


Power Series 


Many commonly encountered functions can be expressed as power series, for example,

$$\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \cdots = \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n+1}}{(2n+1)!},$$

$$\cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \cdots = \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n}}{(2n)!}. \tag{A3.1}$$

In general, we can develop a power series representation for a sufficiently differentiable function, provided that it involves no singularities, fractional powers of $x$ or logarithms.

A3.1 Maclaurin Series 

The Maclaurin series is a power series about $x = 0$. If $f \in C^\infty(a, b)$ for some $a < 0 < b$, the series takes the form

$$f(x) = f(0) + xf'(0) + \frac{x^2}{2!}f''(0) + \cdots + \frac{x^n}{n!}f^{(n)}(0) + \cdots.$$

Here, a prime denotes a derivative with respect to $x$ and $f^{(n)}(x)$ is the $n$th derivative.

Example 

Let's determine the Maclaurin series expansion for the function $f(x) = 1/(1 + x)$. Since

$$f(x) = \frac{1}{1 + x}, \quad f(0) = 1,$$

$$f'(x) = -\frac{1}{(1 + x)^2}, \quad f'(0) = -1,$$

$$f''(x) = \frac{2}{(1 + x)^3}, \quad f''(0) = 2,$$

$$f^{(3)}(x) = -\frac{3 \cdot 2}{(1 + x)^4}, \quad f^{(3)}(0) = -3!,$$

$$f^{(n)}(x) = \frac{(-1)^n n!}{(1 + x)^{n+1}}, \quad f^{(n)}(0) = (-1)^n n!,$$

the Maclaurin series is

$$f(x) = 1 + (-1)x + 2\frac{x^2}{2!} + (-1)3!\frac{x^3}{3!} + \cdots + (-1)^n n!\frac{x^n}{n!} + \cdots$$

$$= 1 - x + x^2 - x^3 + \cdots + (-1)^n x^n + \cdots = \sum_{n=0}^{\infty} (-1)^n x^n.$$
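
A short sketch shows how quickly the partial sums of this series approach $1/(1 + x)$. The evaluation point $x = 0.5$ and the numbers of terms are arbitrary choices made for illustration.

```python
def maclaurin_partial_sum(x, terms):
    """Sum of the first `terms` terms of sum_{n>=0} (-1)^n x^n."""
    return sum((-1) ** n * x ** n for n in range(terms))

x = 0.5
exact = 1.0 / (1.0 + x)
# Absolute error after 5, 10 and 20 terms.
errors = [abs(maclaurin_partial_sum(x, N) - exact) for N in (5, 10, 20)]
print(errors)
```

Because the series is geometric, the error after $N$ terms is $|x|^N / (1 + x)$, so it shrinks rapidly when $|x|$ is well below one.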


A3.2 Taylor Series

The Maclaurin series is a special case of the Taylor series, which is a power series about $x = x_0$. Its existence also requires reasonable behaviour of the function at $x = x_0$, and for a $C^\infty$ function takes the form,

$$f(x) = f(x_0) + (x - x_0)f'(x_0) + \frac{(x - x_0)^2}{2!}f''(x_0) + \cdots + \frac{(x - x_0)^n}{n!}f^{(n)}(x_0) + \cdots.$$


Example 


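Let's determine the Taylor series of the function $f(x) = 1/(1 + x)$ at the point $x_0 = 1$. Since

$$f(x) = \frac{1}{1 + x}, \quad f(x_0) = \frac{1}{2},$$

$$f'(x) = -\frac{1}{(1 + x)^2}, \quad f'(x_0) = -\frac{1}{2^2},$$

$$f''(x) = \frac{2}{(1 + x)^3}, \quad f''(x_0) = \frac{2}{2^3},$$

$$f^{(3)}(x) = -\frac{3 \cdot 2}{(1 + x)^4}, \quad f^{(3)}(x_0) = -\frac{3!}{2^4},$$

$$f^{(n)}(x) = \frac{(-1)^n n!}{(1 + x)^{n+1}}, \quad f^{(n)}(x_0) = (-1)^n \frac{n!}{2^{n+1}},$$

the Taylor series is

$$f(x) = \sum_{n=0}^{\infty} \frac{(x - 1)^n}{n!} (-1)^n \frac{n!}{2^{n+1}} = \sum_{n=0}^{\infty} \frac{(x - 1)^n (-1)^n}{2^{n+1}}. \tag{A3.2}$$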


A3.3 Convergence of Power Series

In order to determine for what values of $x$ a power series is convergent, we can usually use the ratio test, which says that

$$\text{the series } \sum_{n=0}^{\infty} a_n \ \begin{cases} \text{converges if } \lim_{n \to \infty} |a_{n+1}/a_n| < 1, \\ \text{diverges if } \lim_{n \to \infty} |a_{n+1}/a_n| > 1. \end{cases}$$

In the previous example, (A3.2),

$$a_n = \frac{(x - 1)^n (-1)^n}{2^{n+1}}, \quad a_{n+1} = \frac{(x - 1)^{n+1} (-1)^{n+1}}{2^{n+2}},$$

so that, for convergence, we need

$$\lim_{n \to \infty} \left| \frac{a_{n+1}}{a_n} \right| = \left| \frac{x - 1}{2} \right| < 1.$$

This means that the series converges when $-1 < x < 3$. This defines the domain of convergence of the power series (A3.2). This is often written as $|x - 1| < 2$, which defines the radius of convergence of the series as two units from the point $x = 1$. Notice that the function $f(x)$ is singular at $x = -1$, and will not have a Taylor series about this point. It is worth pointing out that the Taylor series of a polynomial is a terminating series and hence has an infinite radius of convergence.
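
The radius of convergence can also be seen numerically. The sketch below evaluates partial sums of (A3.2) at one point inside and one point outside $|x - 1| < 2$; the sample points and the number of terms are arbitrary choices for illustration.

```python
def taylor_partial_sum(x, terms):
    """Partial sum of (A3.2): sum_{n>=0} (-1)^n (x - 1)^n / 2^(n+1)."""
    return sum((-1) ** n * (x - 1) ** n / 2 ** (n + 1) for n in range(terms))

# |x - 1| = 1.5 < 2: the partial sums approach 1 / (1 + x).
inside = abs(taylor_partial_sum(2.5, 200) - 1 / (1 + 2.5))
# |x - 1| = 2.5 > 2: the partial sums grow without bound.
outside = abs(taylor_partial_sum(3.5, 200) - 1 / (1 + 3.5))
print(inside, outside)
```

Note that $f(3.5) = 1/4.5$ is perfectly well defined; it is only the series about $x_0 = 1$ that fails to represent the function there.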

A useful composite result from the theory of Taylor series is that if $f(x) = \sum_{n=0}^{\infty} a_n (x - x_0)^n$ is a Taylor series with radius of convergence $R$ (it converges for $|x - x_0| < R$), $f$ is differentiable for all $x$ such that $|x - x_0| < R$, and $f'(x) = \sum_{n=1}^{\infty} n a_n (x - x_0)^{n-1}$ has the same radius of convergence as the series for $f(x)$. This 'term by term' differentiation result allows us to develop a method for solving differential equations using power series - the method of Frobenius, which we discuss in Chapter 1.

Note that another useful test for the convergence of a series is the comparison test, which says that

if $\sum_{n=0}^{\infty} a_n$ and $\sum_{n=0}^{\infty} b_n$ are series of positive terms then

(i) if $\sum_{n=0}^{\infty} a_n$ converges and $b_k \leq a_k$ for $k > k_0$, $\sum_{n=0}^{\infty} b_n$ also converges,

(ii) if $\sum_{n=0}^{\infty} a_n$ diverges and $b_k \geq a_k$ for $k > k_0$, $\sum_{n=0}^{\infty} b_n$ also diverges.


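A3.4 Taylor Series for Functions of Two Variables


If $f = f(x, y)$ is a scalar field that has sufficient partial derivatives at the point $(x_0, y_0)$, the Taylor series about this point is

$$f(x, y) = f(x_0, y_0) + (x - x_0) \left. \frac{\partial f}{\partial x} \right|_{(x_0, y_0)} + (y - y_0) \left. \frac{\partial f}{\partial y} \right|_{(x_0, y_0)}$$

$$+ \frac{(x - x_0)^2}{2!} \left. \frac{\partial^2 f}{\partial x^2} \right|_{(x_0, y_0)} + (x - x_0)(y - y_0) \left. \frac{\partial^2 f}{\partial x \partial y} \right|_{(x_0, y_0)} + \frac{(y - y_0)^2}{2!} \left. \frac{\partial^2 f}{\partial y^2} \right|_{(x_0, y_0)} + \cdots.$$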

Example 

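Let's find the first two terms in the Taylor series of $e^{-(x^2 + y^2)}$ about the point $(0, 0)$. Since

$$\left. \frac{\partial f}{\partial x} \right|_{(0,0)} = \left. \frac{\partial f}{\partial y} \right|_{(0,0)} = 0,$$

$$\left. \frac{\partial^2 f}{\partial x^2} \right|_{(0,0)} = \left. \frac{\partial^2 f}{\partial y^2} \right|_{(0,0)} = -2, \quad \left. \frac{\partial^2 f}{\partial x \partial y} \right|_{(0,0)} = 0,$$

so that $e^{-(x^2 + y^2)} = 1 - x^2 - y^2 + \cdots$.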
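
A rough numerical check of this truncated series is sketched below. It compares the function with $1 - x^2 - y^2$ along the line $y = x$; the sample distances are arbitrary. Since the next terms in the series are of fourth order, halving the distance from the origin should reduce the error by a factor of about sixteen.

```python
import math

def f(x, y):
    """The function e^{-(x^2 + y^2)}."""
    return math.exp(-(x * x + y * y))

def quadratic_taylor(x, y):
    """Taylor series about (0, 0), truncated after the quadratic terms."""
    return 1.0 - x * x - y * y

# Errors at (r, r) for three values of r, each half the previous one.
errors = [abs(f(r, r) - quadratic_taylor(r, r)) for r in (0.1, 0.05, 0.025)]
ratios = [errors[i] / errors[i + 1] for i in range(2)]
print(errors, ratios)
```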


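We can also generalize the Taylor series from scalar- to vector-valued functions of two variables. Let's define a vector field,

$$\mathbf{f}(\mathbf{x}) = \begin{pmatrix} f(x, y) \\ g(x, y) \end{pmatrix}, \quad \mathbf{x} = \begin{pmatrix} x \\ y \end{pmatrix}.$$

Using the Taylor series for each component of this vector we can write

$$\mathbf{f}(\mathbf{x}) = \mathbf{f}(\mathbf{x}_0) + J(\mathbf{f})(\mathbf{x}_0)(\mathbf{x} - \mathbf{x}_0) + \cdots,$$

where

$$J(\mathbf{f})(\mathbf{x}_0) = \begin{pmatrix} \frac{\partial f}{\partial x}(\mathbf{x}_0) & \frac{\partial f}{\partial y}(\mathbf{x}_0) \\ \frac{\partial g}{\partial x}(\mathbf{x}_0) & \frac{\partial g}{\partial y}(\mathbf{x}_0) \end{pmatrix}$$

is called the Jacobian matrix. This result proves to be very useful in Chapter 9, where we study nonlinear differential equations. Note that the definition of the Jacobian can easily be generalized to higher dimensional vector fields.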
APPENDIX 4


Sequences of Functions


The concepts that we briefly describe in this section are used in Chapter 5, where we discuss the convergence of Fourier series. Consider a sequence of functions, $\{f_k(x)\}$ for $k = 1, 2, \ldots$, defined on some interval of the real line, $I$. We say that $\{f_k(x)\}$ converges pointwise to $f(x)$ on $I$ if $\lim_{k \to \infty} f_k(x_0)$ exists for all $x_0 \in I$, and $f_k(x_0) \to f(x_0)$. For example, consider the sequence $\{f_k(x)\} = \{x + 1/k\}$ on $\mathbb{R}$. This sequence converges pointwise to $f(x) = x$ as $k \to \infty$. The functions $f_k(x)$ and their limit, $f(x) = x$, are continuous. In contrast, consider the sequence $\{f_k(x)\} = \{x^k\}$ on $[0, 1]$. Although each of these functions is continuous, as $k \to \infty$, $f_k(x)$ converges pointwise to

$$f(x) = \begin{cases} 0 & \text{for } 0 \leq x < 1, \\ 1 & \text{for } x = 1, \end{cases}$$

which is not continuous. As a final example, consider the sequence $f_k(x)$, defined on $x \geq 0$, with

$$f_k(x) = \begin{cases} k^2 x(1 - kx) & \text{for } 0 \leq x \leq 1/k, \\ 0 & \text{for } x > 1/k, \end{cases} \tag{A4.1}$$

which is illustrated in Figure A4.1. Since $f_k(x) = 0$ for $x \geq 1/k$, $f_k(x_0) \to 0$ as $k \to \infty$ for all $x_0 > 0$. Moreover, $f_k(0) = 0$ for all $k$, so we conclude that $\{f_k(x)\}$ converges pointwise to $f(x) = 0$, a continuous function. However, the individual members of the sequence don't really provide a good approximation to the pointwise limit, firstly because the maximum value of $f_k(x)$ is $k/4$, which is unbounded as $k \to \infty$, and secondly because

$$\int_0^\infty f_k(x)\,dx = 1/6, \quad \int_0^\infty f(x)\,dx = 0.$$

In order to eliminate this sort of behaviour, we need to introduce the concept of uniform convergence. A sequence of functions $\{f_k(x)\}$ is uniformly convergent to a function $f(x)$ on an interval of the real line, $I$, as $k \to \infty$ if for every $\epsilon > 0$ there exists a positive integer, $K(\epsilon)$, which is not a function of $x$, such that $|f_k(x) - f(x)| < \epsilon$ for all $k \geq K(\epsilon)$ and $x \in I$. This means that, for sufficiently large $k$, we can make $f_k(x)$ arbitrarily close to $f(x)$ over the entire interval $I$. For example, to see that the sequence defined by (A4.1) does not converge uniformly to zero, note that, for a given value of $\epsilon$, we can only guarantee that $|f_k(x) - f(x)| = f_k(x) < \epsilon$ for $k \geq 1/x$, which is not independent of $x$.
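
The behaviour of the sequence (A4.1) can be tabulated with a short sketch. The function names, the midpoint rule and the sample values of $k$ are our choices for illustration; the point is that the maximum grows like $k/4$ and the integral stays at $1/6$, even though the value at any fixed $x > 0$ is eventually zero.

```python
def f_k(k, x):
    """The sequence (A4.1): k^2 x (1 - k x) on [0, 1/k], zero beyond."""
    return k * k * x * (1 - k * x) if 0 <= x <= 1 / k else 0.0

def sup_and_integral(k, samples=20000):
    """Approximate max and integral of f_k using midpoints on [0, 1/k]."""
    h = (1 / k) / samples
    values = [f_k(k, (i + 0.5) * h) for i in range(samples)]
    return max(values), h * sum(values)

results = {k: sup_and_integral(k) for k in (1, 10, 100)}
# Values at the fixed point x = 0.05 for the same three values of k.
pointwise = [f_k(k, 0.05) for k in (1, 10, 100)]
print(results, pointwise)
```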
Fig. A4.1. The sequence of functions given by (A4.1).


Theorem A4.1 If $\{f_k(x)\}$ is a sequence of continuous functions defined on an interval of the real line, $I = [a, b]$, that converges uniformly to $f(x)$, then

(i) $f(x)$ is continuous

(ii) for every $x \in I$

$$\lim_{k \to \infty} \int_a^x f_k(s)\,ds = \int_a^x f(s)\,ds.$$

A proof of this theorem can be found in any textbook on analysis.


APPENDIX 5 


Ordinary Differential Equations 


Chapters 1 to 4 are concerned with the solution of linear second order differential 
equations with variable coefficients. In this appendix, we will review simple methods 
for solving first order differential equations, and second order differential equations 
with constant coefficients. 


A5.1 Variables Separable 

We will start by looking at a class of equations that can be rewritten in the form 

$$g(y)\frac{dy}{dt} = f(t). \tag{A5.1}$$

These equations can be integrated with respect to $t$ to give

$$\int g(y)\frac{dy}{dt}\,dt = \int f(t)\,dt,$$

and, using the chain rule,

$$\int g(y)\,dy = \int f(t)\,dt.$$


Example 

Determine the solution of the differential equation 

$$\frac{dy}{dt} + e^y \cos t = 0,$$

subject to the condition that $y(0) = 1$. Firstly, we convert the equation to the form (A5.1), so that

$$e^{-y}\frac{dy}{dt} = -\cos t.$$

Integrating with respect to $t$ we have

$$-e^{-y} = -\sin t + C.$$

Using the boundary condition we find that $C = -e^{-1}$, and rearranging gives the solution

$$y(t) = \log\left(\frac{1}{e^{-1} + \sin t}\right).$$
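
One way to gain confidence in a solution like this is to substitute it back into the equation numerically. The sketch below does so with a central difference for $dy/dt$; the step size and the sample points are arbitrary, and the check is restricted to values of $t$ for which $e^{-1} + \sin t > 0$, where the logarithm is defined.

```python
import math

def y(t):
    """Solution of dy/dt + e^y cos t = 0 with y(0) = 1."""
    return math.log(1.0 / (math.exp(-1.0) + math.sin(t)))

def residual(t, h=1e-5):
    """Left hand side of the equation, with dy/dt by central difference."""
    dydt = (y(t + h) - y(t - h)) / (2 * h)
    return dydt + math.exp(y(t)) * math.cos(t)

# Largest residual on a grid of points in 0 <= t <= 2.9.
max_residual = max(abs(residual(0.1 * i)) for i in range(30))
print(y(0.0), max_residual)
```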
A5.2 Integrating Factors

We now consider the linear, first order differential equation

$$\frac{dy}{dx} + P(x)y = Q(x).$$

You should recognize this as a type of equation that can be solved by finding an integrating factor. If we multiply through by $\exp\left\{\int^x P(t)\,dt\right\}$, where $t$ is a dummy variable for the integration, we have

$$\exp\left\{\int^x P(t)\,dt\right\}\frac{dy}{dx} + \exp\left\{\int^x P(t)\,dt\right\}P(x)y = Q(x)\exp\left\{\int^x P(t)\,dt\right\}.$$

We can immediately see that the left hand side of this is $\frac{d}{dx}\left\{\exp\left(\int^x P(t)\,dt\right)y\right\}$, so we have

$$\frac{d}{dx}\left\{\exp\left(\int^x P(t)\,dt\right)y\right\} = Q(x)\exp\left\{\int^x P(t)\,dt\right\}.$$

This can be directly integrated to give

$$y = A\exp\left\{-\int^x P(t)\,dt\right\} + \exp\left\{-\int^x P(t)\,dt\right\}\int^x Q(s)\exp\left\{\int^s P(t)\,dt\right\}ds,$$

where $A$ is a constant of integration. Here it is important to notice the structure of the solution as $y = y_h + y_p$, where

$$y_h = A\exp\left\{-\int^x P(t)\,dt\right\}$$

is the solution of the homogeneous differential equation,

$$\frac{dy_h}{dx} + P(x)y_h = 0,$$

and

$$y_p = \exp\left\{-\int^x P(t)\,dt\right\}\int^x Q(s)\exp\left\{\int^s P(t)\,dt\right\}ds$$

is the particular integral solution of the inhomogeneous differential equation

$$\frac{dy_p}{dx} + P(x)y_p = Q(x).$$

The key idea of an integrating factor used here may seem rather like pulling a rabbit out of a hat. In fact, the derivation can be performed systematically using the idea of a Lie group (see Chapter 10).

Example 

Let’s find the solution of the differential equation 


$$\frac{dy}{dx} - y = e^x,$$

subject to the condition that $y(0) = 0$. The integrating factor is

$$\exp\left(\int -1\,dx\right) = e^{-x},$$

so that we have

$$\frac{d}{dx}\left(e^{-x}y\right) = 1.$$

We can integrate this to obtain $e^{-x}y = x + c$, where $c$ is a constant. Since $y(0) = 0$, we must have $c = 0$ and hence $y = xe^x$.
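
The integrating factor formula also lends itself to direct numerical evaluation. The following sketch approximates both integrals with the trapezium rule, taking the lower limit of integration to be zero so that the constant is fixed by $y(0) = y_0$. The function name and the number of steps are our assumptions; it is tested on the example above, for which $y(1) = e$.

```python
import math

def solve_linear_first_order(P, Q, y0, x_end, steps=2000):
    """Approximate y(x_end) for y' + P(x) y = Q(x) with y(0) = y0.

    Uses y = exp(-I(x)) * (y0 + integral_0^x Q(s) exp(I(s)) ds), where
    I(x) = integral_0^x P(t) dt; both integrals use the trapezium rule.
    """
    h = x_end / steps
    I = 0.0                      # running integral of P
    J = 0.0                      # running integral of Q * exp(I)
    prev_P, prev_g = P(0.0), Q(0.0)
    for i in range(1, steps + 1):
        x = i * h
        cur_P = P(x)
        I += 0.5 * h * (prev_P + cur_P)
        cur_g = Q(x) * math.exp(I)
        J += 0.5 * h * (prev_g + cur_g)
        prev_P, prev_g = cur_P, cur_g
    return math.exp(-I) * (y0 + J)

# The worked example: P(x) = -1, Q(x) = e^x, y(0) = 0, so y = x e^x.
approx = solve_linear_first_order(lambda x: -1.0, math.exp, 0.0, 1.0)
exact = 1.0 * math.exp(1.0)
print(approx, exact)
```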


A5.3 Second Order Equations with Constant Coefficients 

We will now remind you how to solve second order ordinary differential equations 
with constant coefficients through a series of examples. 


Example 

Solve the second order ordinary differential equation 


$$\frac{d^2y}{dx^2} + 3\frac{dy}{dx} + 2y = 0,$$

subject to the boundary conditions that $y(0) = 0$ and $y'(0) = 1$.

We can solve constant coefficient equations by seeking a solution of the form $y = e^{mx}$, which we substitute into the equation in order to determine $m$. This gives

$$\frac{d^2y}{dx^2} + 3\frac{dy}{dx} + 2y = \left(m^2 + 3m + 2\right)e^{mx} = 0.$$

Since $e^{mx}$ is never zero, we require $m^2 + 3m + 2 = (m + 2)(m + 1) = 0$. This gives $m = -1$ or $m = -2$. The general solution of the equation is therefore

$$y(x) = Ae^{-x} + Be^{-2x},$$

with $A$ and $B$ constants. The first boundary condition, $y(0) = 0$, yields

$$A + B = 0, \tag{A5.2}$$

whilst the second, $y'(0) = 1$, gives

$$-A - 2B = 1. \tag{A5.3}$$

We now need to solve (A5.2) and (A5.3), which gives us $A = 1$ and $B = -1$, so that the solution is

$$y(x) = e^{-x} - e^{-2x}.$$

It is possible to solve this equation by direct integration, but the method we have presented above is far simpler. Notice that it is not necessary that both boundary conditions are imposed at the same place. Boundary conditions at different values of $x$ will just lead to slightly more complicated simultaneous equations.
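
The steps of this example can be sketched in code as follows. The function name is ours, the routine assumes that the two roots of the quadratic for $m$ are distinct, and complex arithmetic is used so that the same code copes with complex roots.

```python
import cmath

def characteristic_solution(b, c, y0, dy0):
    """Solve y'' + b y' + c y = 0 with y(0) = y0 and y'(0) = dy0.

    Assumes the roots m1, m2 of m^2 + b m + c = 0 are distinct.
    Returns the roots, the constants (A, B) and a function y(x).
    """
    disc = cmath.sqrt(b * b - 4 * c)
    m1, m2 = (-b + disc) / 2, (-b - disc) / 2
    # Conditions: A + B = y0 and m1 A + m2 B = dy0.
    A = (dy0 - m2 * y0) / (m1 - m2)
    B = y0 - A

    def y(x):
        return (A * cmath.exp(m1 * x) + B * cmath.exp(m2 * x)).real

    return (m1, m2), (A, B), y

roots, constants, y = characteristic_solution(3, 2, 0, 1)
print(roots, constants, y(1.0))
```

For a repeated root the division by `m1 - m2` fails, which reflects the fact that a second, independent solution is then needed.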
Example: Equal roots


Solve the differential equation

$$\frac{d^2y}{dx^2} + 2\frac{dy}{dx} + y = 0,$$

subject to the boundary conditions $y(0) = 1$ and $y'(0) = 0$.

Again we can look for a solution of the form $y = e^{mx}$. We find that $m$ satisfies the quadratic equation $m^2 + 2m + 1 = (m + 1)^2 = 0$, so that $m = -1$ is a repeated root, and we have only determined one solution. In fact, the full solution has the form

$$y(x) = Ae^{-x} + Bxe^{-x}. \tag{A5.4}$$

The boundary condition $y(0) = 1$ gives $A = 1$. Since

$$y'(x) = -Ae^{-x} + B(-x + 1)e^{-x},$$

$y'(0) = 0$ gives $-A + B = 0$, and hence $B = 1$. The solution is therefore $y(x) = e^{-x} + xe^{-x} = (x + 1)e^{-x}$.


Example: Imaginary roots 
Solve the differential equation 


$$\frac{d^2y}{dx^2} + 4y = 0,$$

subject to the boundary conditions $y(0) = 1$ and $y'(0) = 3$.

As usual, we seek a solution of the form $y = e^{mx}$, and find that $m$ satisfies $m^2 + 4 = 0$. This means that $m = \pm 2i$. The general solution is therefore

$$y(x) = Ae^{2ix} + Be^{-2ix}. \tag{A5.5}$$

We could proceed with the solution in this form. However, it may be better to use $e^{i\alpha} = \cos\alpha + i\sin\alpha$. Substituting this into (A5.5) gives

$$y(x) = A(\cos 2x + i\sin 2x) + B(\cos 2x - i\sin 2x)$$

$$= (A + B)\cos 2x + i(A - B)\sin 2x = \tilde{A}\cos 2x + \tilde{B}\sin 2x.$$

We have introduced new constants $\tilde{A}$ and $\tilde{B}$, which we determine from the boundary conditions to be $\tilde{A} = 1$ and $\tilde{B} = 3/2$. The solution is therefore

$$y(x) = \cos 2x + \frac{3}{2}\sin 2x.$$


Example: An inhomogeneous equation 
Solve the differential equation 


$$\frac{d^2y}{dx^2} + 3\frac{dy}{dx} + 2y = e^x,$$

subject to the boundary conditions $y(0) = y'(0) = 1$.

Initially we consider just the homogeneous part of the equation and solve

$$\frac{d^2y_h}{dx^2} + 3\frac{dy_h}{dx} + 2y_h = 0.$$

As we saw earlier, the general solution of this is

$$y_h(x) = Ae^{-x} + Be^{-2x}.$$

We now have to find a function $f(x)$, the particular integral, which, when substituted into the inhomogeneous differential equation, yields the right hand side. We notice that when we differentiate $e^x$ we get it back again, so we postulate that $f(x)$ takes the form $ae^x$. Substituting $y = f(x) = ae^x$ into the differential equation gives

$$\frac{d^2f}{dx^2} + 3\frac{df}{dx} + 2f = ae^x + 3ae^x + 2ae^x = e^x.$$

From this expression we find that we need $a = 1/6$. The solution of the inhomogeneous equation is therefore

$$y(x) = Ae^{-x} + Be^{-2x} + \frac{1}{6}e^x.$$

It is at this point that we impose the boundary conditions, which show that $A = 5/2$ and $B = -5/3$, and hence the solution is

$$y(x) = \frac{5}{2}e^{-x} - \frac{5}{3}e^{-2x} + \frac{1}{6}e^x.$$


Example: Right hand side a solution of the homogeneous equation 
Solve the differential equation

$$\frac{d^2y}{dx^2} + 3\frac{dy}{dx} + 2y = e^{-x},$$

subject to the boundary conditions $y(0) = y'(0) = 1$. Following the previous example, we need to find a function which, when substituted into the left hand side of the equation, yields the right hand side. We could try $y = ae^{-x}$, but, since $e^{-x}$ is a solution of the homogeneous equation, substituting it into the left hand side just gives us zero. Taking a lead from the case of repeated roots, we try a solution of the form $y = f(x) = axe^{-x}$. Since

$$f(x) = axe^{-x}, \quad f'(x) = a\left(e^{-x} - xe^{-x}\right) = a(1 - x)e^{-x},$$

$$f''(x) = a(-1 - (1 - x))e^{-x} = a(x - 2)e^{-x},$$

on substituting into the differential equation we obtain

$$a(x - 2)e^{-x} + 3a(1 - x)e^{-x} + 2axe^{-x} = ae^{-x} = e^{-x},$$

which gives $a = 1$. The solution is therefore

$$y(x) = Ae^{-x} + Be^{-2x} + xe^{-x}.$$

We now need to satisfy the boundary conditions, which give $A = 2$ and $B = -1$, and hence

$$y(x) = 2e^{-x} - e^{-2x} + xe^{-x}.$$

This technique is usually referred to as the trial solution method. Clearly it is not 
completely satisfactory, as it requires an element of guesswork. We present a more 
systematic method of solution in Chapter 1. 



APPENDIX 6 


Complex Variables 


The aim of this brief appendix is to provide a reminder of the results that are needed 
in order to be able to invert Laplace transforms using the Bromwich inversion 
integral (6.5), and to understand the material on the asymptotic evaluation of 
complex integrals in Section 11.2. We have had to be selective in what we have 
presented, and note that this appendix is no substitute for a proper course on 
complex variables! A good textbook is Ablowitz and Fokas (1997). 


A6.1 Analyticity and the Cauchy-Riemann Equations

Consider a complex-valued function of a complex variable, $f(s)$, with $s = x + iy$. The natural way to define the derivative of $f$ is

$$\frac{df}{ds} = \lim_{\Delta s \to 0} \frac{f(s + \Delta s) - f(s)}{\Delta s},$$

provided that this limit is independent of the way that $\Delta s = \Delta x + i\Delta y$ tends to zero. A function $f(s)$ is said to be analytic in some region, $R$, of the complex $s$-plane if $df/ds$ exists and is unique in $R$.

Theorem A6.1 If the complex-valued function f(s) = u(x,y) + iv(x,y), where 
s = x + iy and u, v, x and y are real, is analytic in a region, R, of the complex 
s-plane, then 

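$$\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y}, \quad \frac{\partial v}{\partial x} = -\frac{\partial u}{\partial y}. \tag{A6.1}$$

These are known as the Cauchy-Riemann equations.


Proof By definition,

$$\frac{df}{ds} = \lim_{\Delta s \to 0} \frac{f(s + \Delta s) - f(s)}{\Delta s} = \lim_{\Delta x \to 0,\ \Delta y \to 0} \frac{u(x + \Delta x, y + \Delta y) + iv(x + \Delta x, y + \Delta y) - u(x, y) - iv(x, y)}{\Delta x + i\Delta y}.$$

If $\Delta y = 0$,

$$\frac{df}{ds} = \lim_{\Delta x \to 0} \frac{u(x + \Delta x, y) - u(x, y) + i\{v(x + \Delta x, y) - v(x, y)\}}{\Delta x} = \frac{\partial u}{\partial x} + i\frac{\partial v}{\partial x},$$

whilst if $\Delta x = 0$,

$$\frac{df}{ds} = \lim_{\Delta y \to 0} \frac{u(x, y + \Delta y) - u(x, y) + i\{v(x, y + \Delta y) - v(x, y)\}}{i\Delta y} = \frac{\partial v}{\partial y} - i\frac{\partial u}{\partial y}.$$

These are clearly not equal unless the Cauchy-Riemann equations hold. □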

Theorem A6.2 If the Cauchy-Riemann equations, (A6.1), hold in a region, $R$, of the complex $s$-plane for some complex-valued function $f(s) = u(x, y) + iv(x, y)$, where $s = x + iy$ and $u$, $v$, $x$ and $y$ are real, then, provided that all of the partial derivatives in (A6.1) are continuous in $R$, $f(s)$ is analytic in $R$.

We will not give a proof of this here. 
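
Although we omit the proof, the Cauchy-Riemann equations are easy to test numerically. The sketch below estimates $\partial u/\partial x - \partial v/\partial y$ and $\partial v/\partial x + \partial u/\partial y$ by central differences. The test functions $s^2$ (analytic) and $\bar{s}$ (not analytic), the sample point and the step size are our choices for illustration.

```python
def cauchy_riemann_residuals(f, x, y, h=1e-6):
    """Finite-difference values of u_x - v_y and v_x + u_y at s = x + iy."""
    # Partial derivatives of f = u + iv with respect to x and to y.
    fx = (f(complex(x + h, y)) - f(complex(x - h, y))) / (2 * h)
    fy = (f(complex(x, y + h)) - f(complex(x, y - h))) / (2 * h)
    u_x, v_x = fx.real, fx.imag
    u_y, v_y = fy.real, fy.imag
    return u_x - v_y, v_x + u_y

analytic = cauchy_riemann_residuals(lambda s: s * s, 0.7, -0.3)
not_analytic = cauchy_riemann_residuals(lambda s: s.conjugate(), 0.7, -0.3)
print(analytic, not_analytic)
```

Both residuals vanish, to within rounding error, for $s^2$, whereas for $\bar{s} = x - iy$ the first residual is $2$ at every point.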


A6.2 Cauchy’s Theorem, Cauchy’s Integral Formula and Taylor’s 
Theorem 

The contour integral of a complex-valued function $f(s)$ along some contour $C$ in the complex $s$-plane is defined to be

$$\int_C f(s)\,ds = \int_a^b f(s(t))\frac{ds}{dt}(t)\,dt,$$

where $s(t)$ for $a \leq t \leq b$ is a parametric representation of the contour $C$. By convention, the integral around a closed contour is taken in the anticlockwise direction.


Theorem A6.3 (Cauchy's theorem) If $f(s)$ is a single-valued, analytic function in a simply-connected domain $D$ in the complex $s$-plane, then, along any simple closed contour, $C$, in $D$,

$$\oint_C f(s)\,ds = 0.$$


Proof If $f(s) = u + iv$ and $ds = dx + i\,dy$,

$$\oint_C f(s)\,ds = \oint_C (u\,dx - v\,dy) + i\oint_C (v\,dx + u\,dy).$$

If $df/ds$ is continuous in $D$, then $u$ and $v$ have continuous partial derivatives there. We can therefore use Green's theorem in the plane to write

$$\oint_C f(s)\,ds = \iint \left(-\frac{\partial v}{\partial x} - \frac{\partial u}{\partial y}\right) dx\,dy + i\iint \left(\frac{\partial u}{\partial x} - \frac{\partial v}{\partial y}\right) dx\,dy,$$

where the double integrals are taken over the region enclosed by $C$. Since $f$ is analytic in $D$, the Cauchy-Riemann equations hold, and the result is proved. If $df/ds$ is not continuous in $D$, the proof is rather more technical, and we will not give it here. □
Theorem A6.4 (Cauchy's integral formula) If $f(s)$ is a single-valued, analytic function in a simply-connected domain $D$ in the complex $s$-plane, then

$$f(z) = \frac{1}{2\pi i}\oint_C \frac{f(s)}{(s - z)}\,ds,$$

where $C$ is any simple closed contour in $D$ that encloses the point $s = z$.


Proof Let $C_\delta$ be a small circle of radius $\delta$, centred on the point $s = z$. Using Cauchy's theorem,

$$\oint_C \frac{f(s)}{(s - z)}\,ds = \oint_{C_\delta} \frac{f(s)}{(s - z)}\,ds,$$

which we can rewrite as

$$\oint_C \frac{f(s)}{(s - z)}\,ds = f(z)\oint_{C_\delta} \frac{ds}{(s - z)} + \oint_{C_\delta} \frac{f(s) - f(z)}{(s - z)}\,ds.$$

By writing the first of these two integrals in terms of polar coordinates, $s = z + \delta e^{i\theta}$, we find that

$$\oint_{C_\delta} \frac{ds}{(s - z)} = \int_0^{2\pi} \frac{i\delta e^{i\theta}}{\delta e^{i\theta}}\,d\theta = 2\pi i.$$

We can deal with the second integral by noting that, since $f(s)$ is continuous, $|f(s) - f(z)| < \epsilon$ for sufficiently small $|s - z| = \delta$. This means that

$$\left|\oint_{C_\delta} \frac{f(s) - f(z)}{(s - z)}\,ds\right| \leq \oint_{C_\delta} \frac{|f(s) - f(z)|}{|s - z|}\,|ds| < \frac{\epsilon}{\delta}\oint_{C_\delta} |ds| = 2\pi\epsilon.$$

Since $\epsilon \to 0$ as $\delta \to 0$, the result is proved. □


An extension of Cauchy’s integral formula is 


Theorem A6.5 If $f(s)$ is a single-valued, analytic function in a simply-connected domain $D$ in the complex $s$-plane, then all the derivatives of $f$ exist in $D$, and are given by

$$f^{(n)}(z) = \frac{n!}{2\pi i}\oint_C \frac{f(s)}{(s - z)^{n+1}}\,ds, \tag{A6.2}$$

where $C$ is any simple closed contour in $D$ that encloses the point $s = z$.


The proof of this is similar to that of Cauchy’s integral formula. 


Theorem A6.6 (Taylor's theorem) If $f(z)$ is analytic and single-valued inside and on a simple closed contour, $C$, and $z_0$ is a point inside $C$, then

$$f(z) = f(z_0) + (z - z_0)f'(z_0) + \frac{1}{2!}(z - z_0)^2 f''(z_0) + \cdots = \sum_{n=0}^{\infty} \frac{1}{n!}(z - z_0)^n f^{(n)}(z_0). \tag{A6.3}$$

Proof Let $\delta$ be the minimum distance from $z_0$ to the curve $C$, and let $\gamma$ be any circle centred on $z_0$ with radius $\rho < \delta$, so that, for any point $z$ inside $\gamma$, $|z - z_0| < \rho < \delta$. By Cauchy's integral formula,

$$f(z) = \frac{1}{2\pi i}\oint_C \frac{f(w)}{w - z}\,dw = \frac{1}{2\pi i}\oint_\gamma \frac{f(w)}{w - z}\,dw.$$

We also have

$$\frac{1}{w - z} = \frac{1}{w - z_0 - z + z_0} = \frac{1}{w - z_0}\,\frac{1}{1 - \frac{z - z_0}{w - z_0}} = \frac{1}{w - z_0}\left\{1 + \frac{z - z_0}{w - z_0} + \frac{(z - z_0)^2}{(w - z_0)^2} + \cdots\right\}$$

$$= \frac{1}{w - z_0} + \frac{z - z_0}{(w - z_0)^2} + \cdots + \frac{(z - z_0)^n}{(w - z_0)^{n+1}} + \cdots,$$

and this series is uniformly convergent inside $\gamma$, since $|z - z_0| / |w - z_0| < 1$. This means that

$$f(z) = \frac{1}{2\pi i}\oint_\gamma \frac{f(w)}{w - z}\,dw = \frac{1}{2\pi i}\oint_\gamma \frac{f(w)}{w - z_0}\,dw + (z - z_0)\frac{1}{2\pi i}\oint_\gamma \frac{f(w)}{(w - z_0)^2}\,dw + \cdots + (z - z_0)^n \frac{1}{2\pi i}\oint_\gamma \frac{f(w)}{(w - z_0)^{n+1}}\,dw + \cdots.$$

Equation (A6.2) then gives us (A6.3). This series converges in any circle centred on $z_0$ with radius less than $\delta$. If $z_0 = 0$ we get the simpler form

$$f(z) = f(0) + zf'(0) + \frac{1}{2!}z^2 f''(0) + \cdots.$$

□


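Theorem A6.7 (Cauchy's inequality) If $f(z)$ is analytic for $|z| < R$ with $f(z) = \sum_{n=0}^{\infty} a_n z^n$, then, if $M_r$ is the supremum of $|f(z)|$ on the circle $|z| = r < R$, we have $|a_n| \leq M_r / r^n$.


Proof By Taylor's theorem,

$$a_n = \frac{1}{2\pi i}\oint_{|z| = r} \frac{f(z)}{z^{n+1}}\,dz,$$

so that

$$|a_n| \leq \frac{1}{2\pi}\oint_{|z| = r} \frac{|f(z)|}{|z|^{n+1}}\,|dz| \leq \frac{1}{2\pi}\frac{M_r}{r^{n+1}}\,2\pi r = \frac{M_r}{r^n}.$$

□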
A6.3 The Laurent Series and Residue Calculus


Theorem A6.8 Any function $f(s)$ that is single-valued and analytic in an annulus $r_1 \leq |s - s_0| \leq r_2$ for $r_1 < r_2$ real and positive has a series expansion

$$f(s) = \sum_{n=-\infty}^{\infty} c_n (s - s_0)^n,$$

which is convergent for $r_1 < |s - s_0| < r_2$. This is known as the Laurent series, and its coefficients are given by

$$c_n = \frac{1}{2\pi i}\oint_C \frac{f(s)}{(s - s_0)^{n+1}}\,ds$$

for any simple closed contour $C$ that encloses $s = s_0$ and lies in the annulus $r_1 \leq |s - s_0| \leq r_2$.

Although we quote this important theorem, which underlies the techniques of residue calculus, we will not give a proof here. We do note however that

$$\oint_C f(s)\,ds = 2\pi i\,c_{-1}.$$

For an analytic function, $c_{-1} = 0$, and we have Cauchy's theorem. If $f$ is not analytic within $C$, this result shows that we can calculate the integral by determining the coefficient of $1/(s - s_0)$ in the Laurent series. The coefficient $c_{-1}$ is called the residue of $f(s)$ at $s = s_0$.


Theorem A6.9 (The residue theorem) Let $C$ be a simple, closed contour, and let $f(s)$ be a complex-valued function that is single-valued and analytic on and within $C$, except at $n$ isolated singular points, $s = s_1, s_2, \ldots, s_n$. Then

$$\oint_C f(s)\,ds = 2\pi i \sum_{j=1}^{n} a_j,$$

where $a_j$ is the residue of $f(s)$ at $s = s_j$.


Proof Consider the closed contour $C'$, shown in Figure A6.1, which is the contour $C$ deformed into small circles around each singular point and joined to the original position of $C$ by straight contours. Since $f$ is analytic within $C'$,

$$\oint_{C'} f(s)\,ds = 0,$$

by Cauchy's theorem. The integrals along the straight parts of $C'$ cancel out in the limit as they approach each other, so we conclude that

$$\oint_{C'} f(s)\,ds = \oint_C f(s)\,ds - \sum_{j=1}^{n} \oint_{C_j} f(s)\,ds = 0,$$

where $C_j$ is the small circle around $s = s_j$, and hence

$$\oint_C f(s)\,ds = \sum_{j=1}^{n} \oint_{C_j} f(s)\,ds = 2\pi i \sum_{j=1}^{n} a_j.$$

□

[Fig. A6.1: the deformed contour $C'$.]

The residue theorem simply states that the integral around a simple, closed contour is equal to $2\pi i$ times the sum of the residues enclosed by $C$. This can be used in a variety of ways, in particular to evaluate the integral in the Laplace inversion formula, (6.5).

Note that, for a function with a pole of order $m$ at $s = s_0$, and Laurent series expansion

$$f(s) = \frac{a_{-m}}{(s - s_0)^m} + \frac{a_{-(m-1)}}{(s - s_0)^{m-1}} + \cdots + \frac{a_{-1}}{(s - s_0)} + a_0 + a_1(s - s_0) + \cdots,$$

we can write

$$f(s) = \frac{1}{(s - s_0)^m}\,g(s),$$

where

$$g(s) = a_{-m} + a_{-(m-1)}(s - s_0) + \cdots + a_{-1}(s - s_0)^{m-1} + \cdots,$$

and hence

$$\frac{d^{m-1}g}{ds^{m-1}} = (m - 1)!\,a_{-1} + m!\,a_0(s - s_0) + \cdots,$$

which shows that the residue of $f(s)$ at $s = s_0$ is

$$a_{-1} = \frac{1}{(m - 1)!}\left.\frac{d^{m-1}}{ds^{m-1}}\left\{(s - s_0)^m f(s)\right\}\right|_{s = s_0}.$$

In particular, at a simple pole, for which $m = 1$, $a_{-1} = \left.\{(s - s_0)f(s)\}\right|_{s = s_0}$, and at a double pole, for which $m = 2$,

$$a_{-1} = \left.\frac{d}{ds}\left\{(s - s_0)^2 f(s)\right\}\right|_{s = s_0}.$$
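
The formula for the residue at a double pole can be compared with a direct numerical evaluation of $\frac{1}{2\pi i}\oint f(s)\,ds$ around a small circle. In the sketch below, the function $f(s) = e^s/(s - 1)^2$, the radius of the circle and the number of sample points are our choices for illustration. The formula gives $a_{-1} = \frac{d}{ds}e^s$ at $s = 1$, which is $e$.

```python
import cmath
import math

def residue_by_contour(f, s0, radius=0.5, n=400):
    """(1 / 2 pi i) times the integral of f around |s - s0| = radius.

    The circle is traversed anticlockwise and the integral is
    approximated with n equally spaced points.
    """
    total = 0j
    for k in range(n):
        theta = 2 * math.pi * k / n
        s = s0 + radius * cmath.exp(1j * theta)
        ds = 1j * radius * cmath.exp(1j * theta) * (2 * math.pi / n)
        total += f(s) * ds
    return total / (2j * math.pi)

def f(s):
    """An illustrative function with a double pole at s = 1."""
    return cmath.exp(s) / (s - 1) ** 2

numeric = residue_by_contour(f, 1.0)
formula = math.e   # d/ds[(s - 1)^2 f(s)] evaluated at s = 1
print(numeric, formula)
```

Equally spaced points work very well here because the integrand is smooth and periodic in $\theta$ on the circle.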


A6.4 Jordan’s Lemma 

Jordan's lemma allows us to neglect an integral that often arises when evaluating inverse Laplace transforms of functions $F(s)$, provided that $|F(s)| \to 0$ uniformly as $|s| \to \infty$ in the left half plane.

Lemma A6.1 (Jordan) Let $C_J$ be a semi-circular arc of radius $R$ centred at $s = s_0$ in the left half of the complex $s$-plane. If $F(s)$ is a complex-valued function of $s$ and $|F(s)| \to 0$ uniformly on $C_J$ as $R \to \infty$, then

$$\lim_{R \to \infty} \int_{C_J} e^{st}F(s)\,ds = 0,$$

for $t$ a positive constant.


Fig. A6.2. $\sin\theta \geq 2\theta/\pi$.
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Proof On Cj, s = s 0 + Re 10 , so 


1= e st F(s) ds = 
JCj 


r 3ir/2 


7r/2 


0 tSQ-\-tRe Z[ 


and hence 


F(s 0 + Re w )iRe l ° d9 


dO 


= R 


P07T/Z 

| J| < R / e tS0+tRete F(s 0 + Re 10 ) 

J 7t/2 

F(s 0 + -Re ie )| d<9 


p3tt/2 


. e tso| I cos 0+iiR sin 0 j 


'7t/2 


= f?|e‘ s °| 


/*37r/2 


/7t/2 


e tKcos» + 


Igtsol / g— tRsinS' 

F (s 0 + Re i< y 0l+ ^) 

Jo 

V / 


after making the change of variable 0 = 0' + 7r/2. Since |F(s 0 + -Re* e )| — > 0 as 
i? — > oo, |F(so + -Re* e )| < F(i?) for some real-valued, positive function K(R), 
such that K(R) — > 0 as R — > oo. In addition, since sin0 ^ 20/7T, as shown in 
Figure A6.2, 


\I\<R\ e tS0 1 F(-R) [ e - 2tR0 '^ dSf = ^ A ^ (1 - e~ 2tR ) -> 0 as R -> oo. 


□ 
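
The lemma can also be checked numerically. The following is a minimal sketch of our own, not taken from the text: it assumes $s_0 = 0$, $t = 1$ and the example function $F(s) = 1/(s+1)$, for which $|F(s)| \leq K(R) = 1/(R-1)$ on the arc, and it compares $|I|$ with the bound $\pi K(R)(1 - e^{-tR})/t$ that follows from $\sin\theta' \geq 2\theta'/\pi$ on $0 \leq \theta' \leq \pi/2$. It uses the built-in quadrature routine integral, which is available in more recent MATLAB releases.

```matlab
% Numerical illustration of Jordan's lemma with s0 = 0 (all choices are ours)
t = 1;                          % any positive constant
F = @(s) 1./(s + 1);            % example function with |F(s)| -> 0 as |s| -> inf
Rvals = [5 20 80];              % radii of the semi-circular arc
absI  = zeros(size(Rvals));     % preallocate the results
bound = zeros(size(Rvals));
for j = 1:numel(Rvals)
    R = Rvals(j);
    % integrand of I after writing s = R*exp(i*theta), ds = i*R*exp(i*theta) dtheta
    integrand = @(th) exp(t*R*exp(1i*th)).*F(R*exp(1i*th)).*(1i*R*exp(1i*th));
    absI(j)  = abs(integral(integrand, pi/2, 3*pi/2));
    K        = 1/(R - 1);       % uniform bound on |F| on the arc
    bound(j) = pi*K/t*(1 - exp(-t*R));
end
disp([Rvals; absI; bound])      % each |I| should lie below the corresponding bound
```

Both rows of the output decrease as $R$ increases, as the lemma predicts for this particular $F$.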


A6.5 Linear Ordinary Differential Equations in the Complex Plane 

Linear ordinary differential equations in the complex variable $z = x + iy$ can be 
studied in much the same way as those in a real variable, although their solutions 
tend to have a richer structure due to the appearance of branch cuts, singularities 
and other features associated with the complex plane. In particular, the solution 
of linear second order equations by power series works in the same way as the real 
variable case, with the proviso that there is a circle of convergence rather than an 
interval of convergence. In this section, we will prove that power series solutions of 

$$w'' + q(z)w' + r(z)w = 0, \tag{A6.4}$$

with $zq(z)$ and $z^2 r(z)$ analytic at $z = 0$, are convergent. Note that these are 
precisely the conditions for Theorem 1.3, the convergence theorem for a real differential 
equation at a regular singular point. 

If we assume a series solution, 

$$w = \sum_{n=0}^{\infty} a_n z^{n+c},$$

and expansions 

$$zq(z) = q_0 + q_1 z + \cdots, \qquad z^2 r(z) = r_0 + r_1 z + \cdots,$$

then the indicial equation is 

$$a_0\left\{c(c-1) + cq_0 + r_0\right\} = 0,$$

and, in general, 

$$a_n f(c+n) + \sum_{s=0}^{n-1} a_s\left\{(c+s)q_{n-s} + r_{n-s}\right\} = 0, \tag{A6.5}$$

where $f(c) = c(c-1) + cq_0 + r_0$. The indicial equation always has one root that will 
produce a well-defined series solution. We will proceed assuming that we are not 
dealing with a difficult case in which $f(c+n)$ vanishes, when special care would be 
needed. 

We can rewrite (A6.5) in the form 

$$a_n n(n + c_1 - c_2) = -\sum_{s=0}^{n-1} a_s\left\{(c_1+s)q_{n-s} + r_{n-s}\right\}, \tag{A6.6}$$

where $c_1$ and $c_2$ are the solutions of the indicial equation, and we have used the 
fact that $c_1 + c_2 = 1 - q_0$. Since $zq(z)$ and $z^2 r(z)$ are analytic at $z = 0$, they each 
have Taylor expansions that converge in some disc centred on $z = 0$. Let $R$ be the 
smaller of these two radii. Then, by Cauchy’s inequality (Theorem A6.7), for each 
$r < R$ there exists $K = K(r)$ such that 

$$|q_n| \leq \frac{K}{r^n}, \qquad |r_n| \leq \frac{K}{r^n} \qquad \text{for } n = 0, 1, 2, \ldots.$$

If we take the modulus of (A6.6), we can now see that 

$$|a_n|\,n\,|n + c_1 - c_2| \leq K\sum_{s=0}^{n-1} |a_s|\,\frac{|c_1| + s + 1}{r^{n-s}}.$$

Writing $|c_1 - c_2| = \lambda$ and $|c_1| = \mu$, we can define a sequence of coefficients $A_n$ with 
$|a_n| \leq A_n$ using 

$$A_n = |a_n| \quad \text{for } 0 \leq n < \lambda,$$

$$n(n-\lambda)A_n = K\sum_{s=0}^{n-1} A_s\,\frac{\mu + s + 1}{r^{n-s}} \quad \text{for } n \geq \lambda.$$

The definitions of $A_{n-1}$ and $A_n$ for $n \geq \lambda$ show that 

$$n(n-\lambda)A_n - (n-1)(n-1-\lambda)\frac{A_{n-1}}{r} = K(\mu + n)\frac{A_{n-1}}{r}.$$

If we now divide through by $n(n-\lambda)A_{n-1}$ and let $n \to \infty$, we find that $A_n/A_{n-1} \to 1/r$. 
This shows that the radius of convergence of $\sum_{n=0}^{\infty} A_n z^n$ is $r$. But, from our 
definition, $|a_n| \leq A_n$, so that the radius of convergence of $\sum_{n=0}^{\infty} a_n z^n$ is at least $r$ 
for all $r < R$. Hence $\sum_{n=0}^{\infty} a_n z^n$ converges for $|z| < R$ as we claimed. 
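
The recurrence relation (A6.5) is straightforward to evaluate numerically. The sketch below is our own illustration, written in MATLAB (introduced in the following appendix), and is not taken from the text. It assumes the particular case of Bessel's equation of order zero, $w'' + w'/z + w = 0$, for which $zq(z) = 1$ and $z^2 r(z) = z^2$, takes $c = 0$ and $a_0 = 1$, and compares the truncated series with the built-in function besselj. The truncation level N and the test point z are arbitrary choices.

```matlab
% Frobenius coefficients from (A6.5) for Bessel's equation of order zero
N = 20;                           % highest power of z retained (our choice)
q = zeros(1, N+1);  q(1) = 1;     % q(k+1) stores q_k; here z*q(z) = 1
r = zeros(1, N+1);  r(3) = 1;     % r(k+1) stores r_k; here z^2*r(z) = z^2
c = 0;                            % repeated root of the indicial equation
f = @(c) c.*(c - 1) + c*q(1) + r(1);   % f(c) = c(c-1) + c*q_0 + r_0
a = zeros(1, N+1);  a(1) = 1;     % a(n+1) stores a_n, with a_0 = 1
for n = 1:N
    s = 0:n-1;
    total = sum(a(s+1).*((c + s).*q(n-s+1) + r(n-s+1)));
    a(n+1) = -total/f(c + n);     % solve (A6.5) for a_n
end
z = 1.5;                          % test point inside the region of convergence
w = sum(a.*z.^(c + (0:N)));       % truncated series solution
disp([w, besselj(0, z)])          % the two values should agree closely
```

Here $zq(z)$ and $z^2 r(z)$ are entire, so the series converges for all $z$, which is consistent with the result proved above.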
A Short Introduction to MATLAB 


MATLAB† is a programming language and environment that is both powerful and 
easy to use. It contains built-in functions that perform most of the tasks that we 
need in order to illustrate the material in this book, for example, the numerical 
integration of ordinary differential equations‡, the graphical display of data, and much 
more. In particular, MATLAB was originally developed to provide an easy way to 
access programs that act on matrices, and contains extremely efficient routines, for 
example, to solve systems of linear equations and to extract the eigenvalues of a 
matrix§. 

In this appendix, our aim is to show complete newcomers to MATLAB enough 
for them to be able to understand the material in the book that uses MATLAB. We 
introduce several other MATLAB functions in the main part of the text. For a 
more extensive guide to the power of MATLAB, see Higham and Higham (2000). 


A7.1 Getting Started 

MATLAB can be used in two ways. As we shall see, we can save functions, 
which have arguments and return values, and scripts, which are just sequences of 
MATLAB commands, as files, and call them from MATLAB. Alternatively, we can 
simply type commands into MATLAB at the command prompt (>>), and obtain 
results immediately. For example, MATLAB has all the functionality of a scientific 
calculator. 

>> (4+5-6)*14/5, exp(0.4), gamma(0.5)/sqrt(pi) 
ans = 8.4000 
ans = 1.4918 
ans = 1.0000 

Note that pi is a built-in approximation to $\pi$ and gamma is the gamma function, 
$\Gamma(x)$. The three expressions separated by commas are evaluated successively, and 
the calculated answers displayed. The default is to display five significant digits of 
the answer. This can be changed using the format command. For example, 

>> format long 
>> (4+5-6)*14/5, exp(0.4), gamma(0.5)/sqrt(pi) 
ans = 8.40000000000000 
ans = 1.49182469764127 
ans = 1.00000000000000 

† MATrix LABoratory 

‡ We will often use the built-in subroutine ode45 to solve ordinary differential equations. An 
explanation of this is given in Section 9.3.4. 

§ See Trefethen and Bau (1997) for an introduction to numerical linear algebra that uses 
MATLAB extensively to illustrate the material. 

MATLAB has comprehensive online documentation, which can be accessed with 
the command help. Typing help format at the command prompt lists the many 
other formats that are available. 

It should be noted that MATLAB, like any programming language, can only 
perform calculations with a finite precision, since numbers can only be represented 
digitally with a finite precision. In particular, each calculation is subject to an error 
of the order of eps, the machine precision. 

>> eps, sin(pi) 
ans = 2.2204e-016 
ans = 1.2246e-016 

This tells us that the machine precision is about $10^{-16}$†, which corresponds to 
double precision arithmetic. A calculation of $\sin\pi = 0$ leads to an error comparable 
in magnitude to eps. 
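
One practical consequence is that tests for exact equality between computed quantities can fail. The following is a minimal sketch of our own (the variable names and the tolerance of a few multiples of eps are illustrative choices, not a general rule) showing a comparison made to within a tolerance instead.

```matlab
% Exact equality tests can fail because of rounding errors of order eps
a = 0.1 + 0.2;
b = 0.3;
exactlyEqual = (a == b);            % false in double precision arithmetic
tol = 4*eps;                        % tolerance of a few eps (our choice)
closeEnough = abs(a - b) < tol;     % true: a and b differ only by rounding
smallSine = abs(sin(pi)) < tol;     % true: sin(pi) is of order eps, not zero
disp([exactlyEqual, closeEnough, smallSine])
```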


A7.2 Variables, Vectors and Matrices 

As well as being able to perform calculations directly, MATLAB allows the use of 
variables. Most programming languages require the user to declare the type of 
each variable, for example a double precision scalar or a single precision complex 
array. This is not necessary in MATLAB, as storage is allocated as it is needed. 
For example, 

>> A = 1, B = 1:3, C = 0:0.1:0.45, D = linspace(1,2,5)', ... 
E = [1 2 3; 4 5 6], F = [1+i; 2+2*i] 

A = 1 

B = 1 2 3 

C = 0 0.1000 0.2000 0.3000 0.4000 

D = 1.0000 
    1.2500 
    1.5000 
    1.7500 
    2.0000 

E = 1 2 3 
    4 5 6 

F = 1.0000 + 1.0000i 
    2.0000 + 2.0000i 

† The precise value of eps varies depending upon the system on which MATLAB is installed. 

Note that ... allows the current command to overflow onto a new line. We also 
need to remember that MATLAB variables are case-sensitive. For example, X 
and x denote distinct variables. 

A is a 1 x 1 matrix with the single, real entry, 1. Of course, we can just treat this 
as a scalar. 

B is a 1 x 3 matrix (a row vector). The colon notation, a:b, just denotes a vector 
with entries running from a to b at unit intervals. 

C is another row vector. In the notation a:c:b, c denotes the spacing as the 
vector runs from a to b. 

D is a 5 x 1 matrix (a column vector). The function linspace(a,b,n) generates 
a row vector running from a to b with n evenly spaced points. Note that A' is 
the transpose of A when A is real, and its adjoint (conjugate transpose) if A is complex. 

E is a 2 x 3 matrix. The semicolon is used to denote a new row. 

F is a complex column vector. Note that i is the square root of —1. 
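
The distinction between the transpose and the adjoint matters for complex data. As a minimal sketch (the variable names Fadj and Ftr are ours), MATLAB's .' operator gives the transpose without complex conjugation, whereas ' gives the adjoint:

```matlab
F = [1+1i; 2+2i];     % the complex column vector F defined above
Fadj = F';            % adjoint: transposes and takes complex conjugates
Ftr  = F.';           % plain transpose: entries are not conjugated
disp(Fadj)            % row vector with entries 1-1i and 2-2i
disp(Ftr)             % row vector with entries 1+1i and 2+2i
```

For a real matrix the two operators give the same result.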

MATLAB has a rich variety of functions that operate upon matrices (type help 
elmat), which we do not really need for the purposes of this book. However, as an 
example, matrix multiplication takes the obvious form, 

>> E'*F 

ans = 9.0000 + 9.0000i 
      12.0000 + 12.0000i 
      15.0000 + 15.0000i 

>> E*F 

??? Error using ==> * 
Inner matrix dimensions must agree. 

Remember, an $m_1 \times n_1$ matrix can only be multiplied by an $m_2 \times n_2$ matrix if 
$n_1 = m_2$. 

It is also possible to operate on matrices element by element, for example 

>> (1:5).*C 

ans = 0 0.2000 0.6000 1.2000 2.0000 

This takes each element of the row vector 1:5 and multiplies it by the corresponding 
element of C. The two matrices must be of the same size. The use of a full stop in, as 
another example, (1:5)./C, always denotes element by element calculation. Matrix 
addition and subtraction take the obvious form. For example, 

>> C+D' 

ans = 1.0000 1.3500 1.7000 2.0500 2.4000 

>> C - D 

??? Error using ==> - 
Matrix dimensions must agree. 

The one exception to the rule that matrices must have the same dimensions comes 
when we want to add a scalar to each element of a matrix. For example, 

>> C + 1 

ans = 1.0000 1.1000 1.2000 1.3000 1.4000 

This adds 1 to each element of C, even though 1 is a scalar. 
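
The behaviour of operations on matrices of different sizes depends on the MATLAB release. To our knowledge, from release R2016b onwards MATLAB applies implicit expansion to element by element operations on arrays of compatible sizes, so that a command such as C - D, with C a row vector and D a column vector, returns a matrix instead of the error message shown earlier. A minimal sketch, assuming such a release (the names S and M are ours):

```matlab
C = 0:0.1:0.45;           % 1 x 5 row vector, as defined above
D = linspace(1,2,5)';     % 5 x 1 column vector, as defined above
S = C + 1;                % scalar expansion: 1 is added to every element of C
M = C - D;                % implicit expansion: M(i,j) = C(j) - D(i)
disp(size(M))             % M is a 5 x 5 matrix
```

When an error for mismatched sizes is the desired behaviour, it is safest to check the dimensions explicitly with size before operating.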

It is often useful to be able to extract rows or columns of a matrix. For example, 

>> E(:,3), E(2,:) 
ans = 3 
      6 

ans = 4 5 6 

extracts the third column and second row of E. 

Most of MATLAB’s built-in functions can also take matrix arguments. For 
example 

>> sin(C) 

ans = 0 0.0998 0.1987 0.2955 0.3894 

>> E.^2 

ans = 1 4 9 
      16 25 36 


A7.3 User-Defined Functions 

We can add to MATLAB’s set of built-in functions by defining our own functions. 
For example, if we save the commands 

function a = x2(x) 
a = x.^2; 

in a file named x2.m, we will be able to access this function, which simply squares 
each element of x, from the command prompt. Note that the semicolon suppresses 
the output of a. For example, 

>> x2(0:-0.1:-0.5) 

ans = 0 0.0100 0.0400 0.0900 0.1600 0.2500 

It is always prudent to write functions in such a way that they can take a matrix 
argument. For example, if we had written x2 using a = x^2, MATLAB would 
return an error unless x were a square matrix, when the result would be the matrix 
x*x. 
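
For a function as short as x2, more recent versions of MATLAB also allow an anonymous function to be defined at the command prompt, without a separate file. The following sketch shows this alternative, which is not the approach used in the text, together with the difference between .^ and ^ for a square matrix. The names sq, v and M are ours.

```matlab
sq = @(x) x.^2;          % element by element square, as in x2.m
v = sq(0:-0.1:-0.5);     % works for a vector argument
M = [1 2; 3 4];
elementwise = sq(M);     % squares each entry: [1 4; 9 16]
matrixpower = M^2;       % matrix product M*M: [7 10; 15 22]
disp(elementwise), disp(matrixpower)
```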


A7.4 Graphics 

One of the most useful features of MATLAB is its wide range of different methods 
of displaying data graphically (type help graphics). The most useful of these is 
the plot command. The basic idea is that plot(x,y) plots the data contained in 
the vector y against that contained in the vector x, which must, of course, have the 
same dimensions. For example, 

>> x = linspace(1,10,500); y = log(gamma(x)); plot(x,y) 

produces a plot of the logarithm of the gamma function with $1 \leq x \leq 10$. We can 
add axis labels and titles with 

>> xlabel('x'), ylabel('log \Gamma(x)') 
>> title('The logarithm of the gamma function') 

This produces the plot shown in Figure A7.1. Note that MATLAB automatically 
picks an appropriate range for the y axis. We can override this using, for example, 
ylim([a b]), to reset the limits of the y axis. There are many other ways of 
controlling the axes (type help axis). 
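
For larger arguments, gamma(x) overflows in double precision, so that log(gamma(x)) returns Inf. MATLAB's built-in function gammaln evaluates the logarithm of the gamma function directly and avoids this. A minimal sketch (the range of x and the variable name overflowed are our choices):

```matlab
x = linspace(1, 200, 500);
y = gammaln(x);                        % finite for every x in this range
plot(x, y)
xlabel('x'), ylabel('log \Gamma(x)')
overflowed = isinf(log(gamma(200)));   % true: gamma(200) exceeds realmax
disp(overflowed)
```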


Fig. A7.1. A plot of $\log\Gamma(x)$, produced by MATLAB. 


The command plot can produce graphs of more than one function. For example, 
plot(x1,y1,x2,y2,x3,y3) produces three lines in the obvious way. By default, 
the lines are produced in different colours to distinguish between them, and the 
command legend('label 1','label 2','label 3') adds a legend that names 
each line. The style of each line can be controlled, and individual points plotted if 
necessary. For example, plot(x1,y1,'--',x2,y2,'x',x3,y3,'o-') plots the first 
data set as a dashed line, the second as discrete crosses, and the third as a solid 
line with circles at the discrete data points. For a comprehensive list of options, 
type help plot. 

As we shall see in the main part of the book, there is also an easy way of 
plotting functions to get an idea of what they look like. For example, ezplot(@sin) 
produces a plot of $\sin x$. We will also use the command ezmesh, which produces 
a mesh plot of a function of two variables (see Section 2.6.1 for an explanation). 
The quantity @sin is called a function handle. These allow functions to take 
other functions as arguments, by passing the function handle as a parameter. This 
includes functions supplied by the user, so that, for example, ezplot(@x2) plots 
the function x2 that we defined earlier. 
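
Function handles are not restricted to plotting commands. As a sketch of the idea, the built-in quadrature routine integral, which is available in more recent MATLAB releases, accepts a handle to the integrand. The anonymous function g and the variable names I1 and I2 below are our own.

```matlab
I1 = integral(@sin, 0, pi);     % handle to a built-in function; exact value is 2
g  = @(x) x.^2;                 % handle to a user-defined (anonymous) function
I2 = integral(g, 0, 3);         % exact value is 9
fplot(@sin, [0 2*pi])           % fplot also accepts a function handle
disp([I1, I2])
```

Note that the integrand must accept a vector argument, which is another reason to write functions using element by element operations.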

A7.5 Programming in MATLAB 

All of the control structures that you would expect a programming language to 
have are available in MATLAB. The most commonly used of these are FOR loops 
and the IF, ELSEIF structure. 


FOR Loops 

There are many examples of FOR loops in the main text. The basic syntax is 

for c = x 
    statements 
end 


The statements are executed with c taking as its value the successive columns of 
x. For example, if we save the script 



to a file sinkx.m, and then type sinkx at the command prompt, the commands 
in sinkx.m are executed. This is a MATLAB script. Note that the indentation of 
the lines is not necessary, but makes the script more readable. The editor supplied 
with MATLAB formats scripts in this way automatically. The command pause 
waits for the user to hit any key before continuing, so this script successively plots 
$\sin x$, $\sin 2x$, $\sin 3x$ and $\sin 4x$ for $0 \leq x \leq 2\pi$. 
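
The listing of sinkx.m does not appear above. A minimal sketch of a script that is consistent with the description, assuming 500 plotting points (our choice), is:

```matlab
% sinkx.m: plot sin(kx) for k = 1, 2, 3, 4 on 0 <= x <= 2*pi
x = linspace(0, 2*pi, 500);
for k = 1:4
    plot(x, sin(k*x))
    pause                 % wait for a key press before the next plot
end
```

Here the loop variable k takes the successive columns of the row vector 1:4, that is, the values 1, 2, 3 and 4 in turn.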

As a general rule, if a script needs to execute quickly, FOR loops should be 
avoided where possible, in favour of matrix and vector operations. For example, 

>> tic, A = (1:50000).^2; toc 
elapsed_time = 0.0100 

>> tic, for k = 1:50000, A(k) = k^2; end, toc 
elapsed_time = 0.4010 

The command tic sets an internal clock to zero, and toc returns 
the time elapsed, in seconds, since the last tic. We can see that allocating $k^2$ to 
A(k), the $k$th entry of A, using element by element squaring of 1:50000, known as 
vectorizing the function, executes about forty times faster than performing the 
same task using a FOR loop. For an in-depth discussion of vectorization, see Van 
Loan (1997). 

Another way of speeding up a MATLAB script is to preallocate the array. In 
the above sequence, the array A already exists, and is of the correct size when the 
FOR loop executes. Consider 

>> A = []; 

>> tic, for k = 1:50000, A(k) = k^2; end, toc 
elapsed_time = 94.1160 

If we set A to be the empty array, and then allocate $k^2$ to A(k), expecting MATLAB 
to successively increase the size of A at each iteration of the loop, we can see that 
the execution time becomes huge compared with that when A is already known to 
be a vector of length 50000. 
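
In practice, the preallocation is usually made explicit by creating an array of zeros before the loop. A minimal sketch (the names n and tPre are ours, and timings will differ between machines and MATLAB releases):

```matlab
n = 50000;
A = zeros(1, n);          % preallocate: A has its final size before the loop
tic
for k = 1:n
    A(k) = k^2;           % fills an existing entry, so A is never resized
end
tPre = toc;               % elapsed time in seconds (machine dependent)
disp(tPre)
```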

The IF, ELSEIF Structure 
The basic syntax for an IF, ELSEIF statement is 

if expression 
    statements 
elseif expression 
    statements 
elseif expression 
    statements 
else 
    statements 
end 

For example, consider the function 


function f = f1(x) 
f = zeros(size(x)); 
for k = 1:length(x) 
    if (x(k)<0) | (x(k)>3) 
        disp('x out of range') 
    elseif x(k)<1 
        f(k) = x(k)^2; 
    elseif x(k)<2 
        f(k) = cos(x(k)); 
    else 
        f(k) = exp(-x(k)); 
    end 
end 

This evaluates the function (A2.1), which is plotted in Figure A2.2, at the points 
in the vector x, and displays a warning if any of the elements of x lie outside the 
range [0,3], where the function is defined. The logical function or is denoted by 
| in MATLAB. The function length(x) calculates the length of a vector x. Note 
that we initialize f as a vector of zeros with the same dimensions as x using f = 
zeros(size(x)). As we have already discussed, this preallocation of the matrix 
significantly speeds up the execution of the function. In fact, although this is a 
transparent way of programming the function, it is still not as efficient as it could 
be. Consider the function 


function f = f2(x) 
if ((x<0)|(x>3)) == zeros(size(x)) 
    f = ((x>=0)&(x<1)).*x.^2 + ((x>=1)&(x<2)).*cos(x) ... 
        + ((x>=2)&(x<=3)).*exp(-x); 
else 
    disp('x out of range') 
end 


Note that ==, not =, is used to test the equality of two matrices. The vector (x>=0) 
is the same size as x, and contains ones where the corresponding element of x is 
positive or zero, and zeros elsewhere. The operator & acts in the obvious way as 
the logical function and. The function f2 evaluates $f(x)$ in a vectorized manner, 
avoiding the use of a FOR loop. Now consider 

>> tic, y = f1(x); toc 
elapsed_time = 1.4320 
>> tic, y = f2(x); toc 
elapsed_time = 0.1300 

It is clear that the vectorized function f2 evaluates (A2.1) far more efficiently than 
f1, which uses a FOR loop and an IF, ELSEIF statement. 
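
Another vectorized formulation uses logical indexing, which evaluates each branch only at the points where it applies. The sketch below is our own variant, not taken from the text, and assumes that (A2.1) is the piecewise function implemented in f1 and f2. The sample points and the names m1, m2, m3 and g are illustrative.

```matlab
x = linspace(0, 3, 13);                    % sample points in [0,3] (our choice)
f = zeros(size(x));                        % preallocate the result
m1 = (x >= 0) & (x < 1);   f(m1) = x(m1).^2;
m2 = (x >= 1) & (x < 2);   f(m2) = cos(x(m2));
m3 = (x >= 2) & (x <= 3);  f(m3) = exp(-x(m3));
% Compare with the mask-multiplication form used in f2
g = m1.*x.^2 + m2.*cos(x) + m3.*exp(-x);
disp(max(abs(f - g)))                      % the two forms should agree
```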
